Abstract

The paper studies the problem of distributed average consensus in sensor networks with quantized data and random 
link failures. To achieve consensus, dither (small noise) is added to the sensor states before quantization. When the 
quantizer range is unbounded (countable number of quantizer levels), stochastic approximation shows that consensus is 
asymptotically achieved with probability one and in mean square to a finite random variable. We show that the
mean-squared error (m.s.e.) can be made arbitrarily small by tuning the link weight sequence, at the cost of a slower
convergence rate of the algorithm. To study dithered consensus with random links when the range of the quantizer is bounded, we
establish uniform boundedness of the sample paths of the unbounded quantizer. This requires characterization of the 
statistical properties of the supremum taken over the sample paths of the state of the quantizer. This is accomplished 
by splitting the state vector of the quantizer into two components: one along the consensus subspace and the other
along the subspace orthogonal to the consensus subspace. The proofs use maximal inequalities for submartingale and 
supermartingale sequences. From these, we derive probability bounds on the excursions of the two subsequences, 
from which probability bounds on the excursions of the quantizer state vector follow. The paper shows how to use 
these probability bounds to design the quantizer parameters and to explore tradeoffs among the number of quantizer 
levels, the size of the quantization steps, the desired probability of saturation, and the desired level of accuracy $\epsilon$
away from consensus. Finally, the paper illustrates the quantizer design with a numerical study.

I. Introduction

This paper is concerned with consensus in networks, e.g., a sensor network, when the data exchanges among 
nodes in the network (sensors, agents) are quantized. Before detailing our work, we briefly overview the literature. 

Literature review. Consensus is broadly understood as individuals in a community achieving a consistent view 
of the world by interchanging information regarding their current state with their neighbors. Considered in the early
work of Tsitsiklis et al. ([1], [2]), it has received considerable attention in recent years and arises in numerous
applications including: load balancing, [3], alignment, flocking, and multi-agent collaboration, e.g., [4], [5], vehicle 
formation, [6], gossip algorithms, [7], tracking, data fusion, [8], and distributed inference, [9]. We refer the reader 
to the recent overviews on consensus, which include [10], [11]. 

Consensus is a distributed iterative algorithm where the sensor states evolve on the basis of local interactions. 
Reference [5] used spectral graph concepts like graph Laplacian and algebraic connectivity to prove convergence 
for consensus under several network operating conditions (e.g., delays and switching networks, i.e., time varying). 
Our own prior work has been concerned with designing topologies that optimize consensus with respect to the 
convergence rate, [12], [9]. Topology design is concerned with two issues: 1) the definition of the graph that specifies 
the neighbors of each sensor — i.e., with whom should each sensor exchange data; and 2) the weights used by the 
sensors when combining the information received from their neighbors to update their state. Reference [13] considers 
the problem of weight design, when the topology is specified, in the framework of semi-definite programming. 
References [14], [15] considered the impact of different topologies on the convergence rate of consensus, in 
particular, regular, random, and small-world graphs, [16]. Reference [17] relates the convergence properties of 
consensus algorithms to the effective resistance of the network, thus obtaining convergence rate scaling laws for 
networks in up to 3-dimensional space. Convergence results for general problems in multi-vehicle formation have been
considered in [18], where convergence rate is related to the topological dimension of the network and stabilizability
issues in higher dimensions are addressed. Robustness issues in consensus algorithms in the presence of analog 
communication noise and random data packet dropouts have been considered in [19]. 

Review of literature on quantized consensus. Distributed consensus with quantized transmission has been studied 
recently in [20], [21], [22], [23] with respect to time-invariant (fixed) topologies. Reference [24] considers quantized 
consensus for a certain class of time-varying topologies. The algorithm in [20] is restricted to integer-valued initial 
sensor states, where at each iteration the sensors exchange integer-valued data. It is shown there that the sensor states 
are asymptotically close (in their appropriate sense) to the desired average, but may not reach absolute consensus. 
In [21], the noise in the consensus algorithm studied in [25] is interpreted as quantization noise, and it is shown there by
simulation with a small network that the variance of the quantization noise is reduced as the algorithm iterates and the
sensors converge to a consensus. References [22], [26] study probabilistic quantized consensus. Each sensor updates
its state at each iteration by probabilistically quantizing its current state (which [27] claims is equivalent to dithering)
and linearly combining it with the quantized versions of the states of the neighbors. They show that the sensor 
states reach consensus a.s. to a quantized level. In [23] a worst case analysis is presented on the error propagation 
of consensus algorithms with quantized communication for various classes of time-invariant network topologies, 
while [28] addresses the impact of more involved encoding/decoding strategies, beyond the uniform quantizer. The 
effect of communication noise in the consensus process may lead to several interesting phase transition phenomena
in global network behavior, see, for example, [29] in the context of a network of mobile agents with a
nonlinear interaction model and [30], which rigorously establishes a phase transition behavior in a network of bipolar
agents when the communication noise exceeds a given threshold. Consensus algorithms with general imperfect
communication (including quantization) in a certain class of time-varying topologies have been addressed in [24],
which assumes that there exists a window of fixed length, such that the union of the network graphs formed 
within that window is strongly connected. From a distributed detection viewpoint, binary consensus algorithms 
over networks of additive white Gaussian noise channels were addressed in [31], which proposed soft information 
processing techniques to improve consensus convergence properties over such noisy channels. The impact of fading 
on consensus is studied in [32]. 

Contributions of this paper. We consider consensus with quantized data and random inter-sensor link failures. 
This is useful in applications where limited bandwidth and power for inter-sensor communications preclude
exchanges of high precision (analog) data, as in wireless sensor networks. Further, randomness in the environment
results in random data packet dropouts. To handle quantization, we modify standard consensus by adding a small
amount of noise, dither, to the data before quantization and by letting the consensus weights be time varying,
satisfying a persistence condition: their sum over time diverges, while their square sum is finite. We will show that
dithered quantized consensus in networks with random links converges. 

The randomness of the network topology is captured by assuming that the time-varying Laplacian sequence,
$\{L(i)\}_{i \ge 0}$, which characterizes the communication graph, is independent with mean $\overline{L}$; further, to prove convergence,
we will need the mean graph algebraic connectivity (first nonzero eigenvalue of $\overline{L}$) to satisfy $\lambda_2(\overline{L}) > 0$, i.e., the network to
be connected on the average. Our proofs do not require any distributional assumptions on the link failure model (in
space). During the same iteration, the link failures can be spatially dependent, i.e., correlated across different edges
of the network. The model we work with in this paper subsumes the erasure network model, where link failures
are independent both over space and time. Wireless sensor networks motivate us since interference among the
sensors' communications correlates the link failures over space, while over time, it is still reasonable to assume that
the channels are memoryless or independent. Note that the assumption $\lambda_2(\overline{L}) > 0$ does not require the individual
random instantiations of $L(i)$ to be connected; in fact, it is possible for all the instantiations to be disconnected.
This captures a broad class of asynchronous communication models; for example, the random asynchronous gossip
protocol in [33] satisfies $\lambda_2(\overline{L}) > 0$ and hence falls under this framework.

The main contribution of this paper is the study of the convergence and the detailed analysis of the sample path 
of this dithered distributed quantized consensus algorithm with random link failures. This distinguishes our work 
from [20], which considers fixed topologies (no random links) and integer-valued initial sensor states, while our initial
states are arbitrary real values. To our knowledge, the convergence and sample path analysis of dithered quantized
consensus with random links has not been carried out before. The sample path analysis of quantized consensus 
algorithms is needed because in practice quantizers work with bounded (finite) ranges. The literature usually pays
scant attention to, or simply ignores, the boundary effects induced by the bounded range of the quantizers; in other
words, although assuming finite range quantizers, the analysis in the literature ignores the boundary effects. Our
paper studies carefully the sample path behavior of quantized consensus when the range of the quantizer is bounded.
It computes, under appropriate conditions, the probability of large excursions of the sample paths and shows that
the quantizer can be designed so that, with probability as close to 1 as desired, the sample path excursions remain
bounded, within an $\epsilon$-distance of the desired consensus average. Neither our previous work [19], which deals with
consensus with noisy analog communications in a random network, nor references [22], [26], [27], which introduce
a probabilistic quantized consensus algorithm in fixed networks, nor [34], which studies consensus with analog
noisy communication and fixed network, studies the sample path behavior of quantized consensus. Also, while the
probabilistic consensus in [22], [26], [27] converges almost surely to a quantized level, in our work, we show that 
dithered consensus converges a.s. to a random variable which can be made arbitrarily close to the desired average. 

To study the a.s. convergence and m.s.s. convergence of the dithered distributed quantizers with random links 
and unbounded range, the stochastic approximation method we used in [19] is sufficient. In simple terms, as in [19], we
associate a Lyapunov function with the quantized distributed consensus and study the behavior of
this Lyapunov function along the trajectories of the noisy consensus algorithm with random links. To show almost
sure convergence, we show that a functional of this process is a nonnegative supermartingale; convergence follows 
from convergence results on nonnegative supermartingales. We do this in Section III where we term the unbounded 
dithered distributed quantized consensus algorithm with random links simply Quantized Consensus, for short, or 
QC algorithm. Although the general principles of the approach are similar to the ones in [19], the details are different
and not trivial; we minimize the overlap and refer the reader to [19] for details. A second reason to go over this
analysis in the paper for the QC algorithm is that we derive in this section for QC several specific bounds that are
needed as intermediate results for the sample path analysis that is carried out in Section IV when studying
dithered quantized consensus when the quantizer is bounded, i.e., Quantized Consensus with Finite quantizer, the
QCF algorithm. The QCF is a very simple algorithm: it runs QC until the QC state reaches the quantizer bound, at which
point an error is declared and the algorithm is terminated. To study QCF, we establish uniform boundedness of the sample
paths of the QC algorithm. This requires establishing the statistical properties of the supremum taken over the 
sample paths of the QC. This is accomplished by splitting the state vector of the quantizer in two components: 
one along the consensus subspace and the other along the subspace orthogonal to the consensus subspace. These 
proofs use maximal inequalities for submartingale and supermartingale sequences. From these, we are able to derive 
probability bounds on the excursions of the two subsequences, which we use to derive probability bounds on the 
excursions of the QC. Carrying out this sample path study requires new methods of analysis that go
well beyond the stochastic approximation methodology that we used in our paper [19], and also used by [34] to
study consensus with noise but fixed networks. The detailed sample path analysis leads to bounds on the probability
of the sample path excursions of the QC algorithm. We then use these bounds to design the quantizer parameters
and to explore tradeoffs among these parameters. In particular, we derive a probability of $\epsilon$-consensus expressed
in terms of the (finite) number of quantizer levels, the size of the quantization steps, the desired probability of
saturation, and the desired level of accuracy $\epsilon$ away from consensus.
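
The QCF rule described in this paragraph (run QC until a state reaches the quantizer bound, then declare an error and stop) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation analyzed in the paper: the names `run_qcf`, `qc_step`, and `bound` are hypothetical, `qc_step` stands for any function that performs one QC iteration, and the quantizer range is assumed to be the symmetric interval [-bound, bound].

```python
from typing import Callable, List, Tuple


def run_qcf(x0: List[float],
            qc_step: Callable[[List[float], int], List[float]],
            bound: float,
            max_iters: int) -> Tuple[List[float], bool, int]:
    """Run QC iterations until max_iters or until a state reaches the bound.

    Returns (state, saturated, iterations_done). `bound` is a hypothetical
    symmetric range limit for the finite quantizer.
    """
    x = list(x0)
    for i in range(max_iters):
        if any(abs(xn) >= bound for xn in x):
            # Saturation: an error is declared and the algorithm terminates.
            return x, True, i
        x = qc_step(x, i)
    saturated = any(abs(xn) >= bound for xn in x)
    return x, saturated, max_iters


# Toy stand-ins for a QC iteration (assumptions, for illustration only).
def shrink(x: List[float], i: int) -> List[float]:
    return [0.5 * xn for xn in x]


def grow(x: List[float], i: int) -> List[float]:
    return [2.0 * xn for xn in x]


print(run_qcf([1.0, -2.0], shrink, bound=4.0, max_iters=10))
print(run_qcf([1.0], grow, bound=4.0, max_iters=10))
```

In the paper's setting, the design question is how to choose the range and step so that the saturation branch is taken only with a small prescribed probability.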

For the QC algorithm, there exists an interesting trade-off between the m.s.e. (between the limiting random 
variable and the desired initial average) and the convergence: by tuning the link weight sequence appropriately, 
it is possible to make the m.s.e. arbitrarily small (irrespective of the quantization step-size), though penalizing 
the convergence rate. To tune the QC-algorithm, we introduce a scalar control parameter s (associated with the 
time-varying link weight sequence), which can make the m.s.e. as small as we want, irrespective of how large
the step-size $\Delta$ is. This is significant in applications that rely on accuracy and may call for a very small m.s.e. to
be useful. More specifically, if a cost structure is imposed on the consensus problem, where the objective is a
function of the m.s.e. and the convergence rate, one may obtain the optimal scaling s by minimizing the cost from
the Pareto-optimal curve generated by varying s. These tradeoffs and the vanishingly small m.s.e. contrast with the
algorithms in [20], [21], [22], [23], [24], where the m.s.e. is proportional to $\Delta^2$, with $\Delta$ the quantization step-size; if the
step-size is large, these algorithms lead to a large m.s.e.

Organization of the paper. We comment briefly on the organization of the main sections of the paper. Section II 
summarizes relevant background, including spectral graph theory and average consensus, and presents the dithered 
quantized consensus problem with the dither satisfying the Schuchman conditions. Section III considers the
convergence of the QC algorithm. It shows a.s. convergence to a random variable, whose m.s.e. is fully characterized. 
Section IV studies the sample path behavior of the QC algorithm through the QCF. It uses the expressions we derive 
for the probability of large excursions of the sample paths of the quantizer to consider the tradeoffs among different 
quantizer parameters, e.g., number of bits and quantization step, and the network topology to achieve optimal 
performance under a constraint on the number of levels of the quantizer. These tradeoffs are illustrated with a 
numerical study. Finally, Section V concludes the paper. 

II. Consensus with Quantized Data: Problem Statement 

We present preliminaries needed for the analysis of the consensus algorithm with quantized data. The set-up of 
the average consensus problem is standard, see the introductory sections of relevant recent papers. 

A. Preliminaries: Notation and Average Consensus 

The sensor network at time index $i$ is represented by an undirected, simple, connected graph $G(i) = (V, E(i))$.
The vertex and edge sets $V$ and $E(i)$, with cardinalities $|V| = N$ and $|E(i)| = M(i)$, collect the sensors and
communication channels or links among sensors in the network at time $i$. The network topology at time $i$, i.e., with
which sensors each sensor communicates, is described by the $N \times N$ discrete Laplacian $L(i) = L^T(i) =
D(i) - A(i) \ge 0$. The matrix $A(i)$ is the adjacency matrix of the connectivity graph at time $i$, a (0, 1) matrix
where $A_{nk}(i) = 1$ signifies that there is a link between sensors $n$ and $k$ at time $i$. The diagonal entries of $A(i)$ are
zero. The diagonal matrix $D(i)$ is the degree matrix, whose diagonal is $D_{nn}(i) = d_n(i)$, where $d_n(i)$ is the degree
of sensor $n$, i.e., the number of links of sensor $n$ at time $i$. The neighbors of a sensor or node $n$, collected in the
neighborhood set $\Omega_n(i)$, are those sensors $k$ for which the entries $A_{nk}(i) \ne 0$. The Laplacian is positive semidefinite;
in case the network is connected at time $i$, the corresponding algebraic connectivity or Fiedler value is positive, i.e.,
the second eigenvalue of the Laplacian $\lambda_2(L(i)) > 0$, where the eigenvalues of $L(i)$ are ordered in increasing order.
For detailed treatment of graphs and their spectral theory see, for example, [35], [36], [37]. Throughout the paper
the symbols $P[\cdot]$ and $E[\cdot]$ denote the probability and expectation operators w.r.t. the probability space of interest.

Distributed Average Consensus. The sensors measure the data $x_n(0)$, $n = 1, \dots, N$, collected in the vector
$x(0) = [x_1(0) \cdots x_N(0)]^T \in \mathbb{R}^{N \times 1}$. Distributed average consensus computes the average $r$ of the data

\[ r = \frac{1}{N} \sum_{n=1}^{N} x_n(0) = \frac{1}{N} x(0)^T \mathbf{1} \tag{1} \]

by local data exchanges among neighboring sensors. In (1), the column vector $\mathbf{1}$ has all entries equal to 1. Consensus
is an iterative algorithm where at iteration $i$ each sensor updates its current state $x_n(i)$ by a weighted average of
its current state and the states of its neighbors. Standard consensus assumes a fixed connected network topology,
i.e., the links stay online permanently, the communication is noiseless, and the data exchanges are analog. Under
mild conditions, the states of all sensors reach consensus, converging to the desired average $r$, see [5], [13],

\[ \lim_{i \to \infty} x(i) = r \mathbf{1} \tag{2} \]

where $x(i) = [x_1(i) \cdots x_N(i)]^T$ is the state vector that stacks the state of the $N$ sensors at iteration $i$. We consider
consensus with quantized data exchanges and random topology (links fail or become alive at random times), which 
models packet dropouts. In [19], we studied consensus with random topologies and (analog) noisy communications. 
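
As a point of reference for the standard (noiseless, fixed-topology) setting, the following minimal Python sketch iterates an equal-weight consensus update on a small ring. The graph, the constant weight 0.25, and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
from typing import List


def consensus_step(x: List[float], neighbors: List[List[int]], a: float) -> List[float]:
    """One synchronous update: x_n <- (1 - a*d_n) x_n + a * (sum of neighbor states)."""
    return [(1.0 - a * len(nbrs)) * xn + a * sum(x[l] for l in nbrs)
            for xn, nbrs in zip(x, neighbors)]


def run_consensus(x0: List[float], neighbors: List[List[int]],
                  a: float, iters: int) -> List[float]:
    x = list(x0)
    for _ in range(iters):
        x = consensus_step(x, neighbors, a)
    return x


# Ring of 4 sensors: sensor n talks to its two adjacent sensors.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
x0 = [1.0, 5.0, 3.0, 7.0]
r = sum(x0) / len(x0)            # desired average, as in (1)
x_final = run_consensus(x0, ring, a=0.25, iters=100)
print("average r =", r)
print("final states =", x_final)
```

Because the update matrix is symmetric with unit row sums, the sum of the states is preserved at every iteration, which is why the common limit is the initial average.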

B. Dithered Quantization: Schuchman Conditions 

We write the sensor updating equations for consensus with quantized data and random link failures as 

\[ x_n(i+1) = \left[ 1 - a(i) d_n(i) \right] x_n(i) + a(i) \sum_{l \in \Omega_n(i)} f_{nl,i}\left[ x_l(i) \right], \quad 1 \le n \le N \tag{3} \]

where: $a(i)$ is the weight at iteration $i$; and $\{f_{nl,i}\}_{1 \le n,l \le N,\ i \ge 0}$ is a sequence of functions (possibly random)
modeling the quantization effects. Note that in (3), the weights $a(i)$ are the same across all links (the equal
weights consensus, see [13]), but the weights may change with time. Also, the degree $d_n(i)$ and the neighborhood
$\Omega_n(i)$ of each sensor $n$, $n = 1, \dots, N$, depend on $i$, emphasizing that the topology may be random and time-varying.

Quantizer. Each inter-sensor communication channel uses a uniform quantizer with quantization step $\Delta$. We
model the communication channel by introducing the quantizing function, $q(\cdot) : \mathbb{R} \to \mathcal{Q}$,

\[ q(y) = k\Delta, \quad \left(k - \tfrac{1}{2}\right)\Delta \le y < \left(k + \tfrac{1}{2}\right)\Delta \tag{4} \]

where $y \in \mathbb{R}$ is the channel input. We write

\[ q(y) = y + e(y) \tag{5} \]

where $e(y)$ is the quantization error. Conditioned on the input, the quantization error $e(y)$ is deterministic, and

\[ -\frac{\Delta}{2} \le e(y) < \frac{\Delta}{2}, \quad \forall y \tag{6} \]

We first consider quantized consensus (QC) with unbounded range, i.e., the quantization alphabet

\[ \mathcal{Q} = \{ k\Delta \mid k \in \mathbb{Z} \} \tag{7} \]

is countably infinite. In Section IV, we consider what happens when the range of the quantizer is finite: quantized
consensus with finite (QCF) alphabet. This study requires that we detail the sample path behavior of the QC
algorithm.
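
The uniform quantizer with unbounded range can be written in a couple of lines. The sketch below is an illustration only; the function names are hypothetical, and rounding at the exact cell boundaries follows the `floor` convention used in the code.

```python
import math


def quantize(y: float, delta: float) -> float:
    """Uniform quantizer with step delta and unbounded range.

    Returns k*delta for the integer k with (k - 1/2)*delta <= y < (k + 1/2)*delta.
    """
    k = math.floor(y / delta + 0.5)
    return k * delta


def quantization_error(y: float, delta: float) -> float:
    """Deterministic error e(y) = q(y) - y; its magnitude never exceeds delta/2."""
    return quantize(y, delta) - y


delta = 0.5
for y in (0.24, 0.26, -0.3, 3.1):
    print(y, quantize(y, delta), quantization_error(y, delta))
```

For a given input the error is a fixed number, which is the property that the naive scheme discussed next runs into.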

We discuss briefly why a naive approach to consensus will fail (see [27] for a similar discussion). If we use
directly the quantized state information, the functions $f_{nl,i}(\cdot)$ in eqn. (3) are

\begin{align}
f_{nl,i}(x_l(i)) &= q(x_l(i)) \tag{8} \\
&= x_l(i) + e(x_l(i)) \tag{9}
\end{align}

Equations (3) then take the form

\[ x_n(i+1) = (1 - a(i) d_n(i)) x_n(i) + a(i) \sum_{l \in \Omega_n(i)} x_l(i) + a(i) \sum_{l \in \Omega_n(i)} e(x_l(i)) \tag{10} \]

The non-stochastic errors (the rightmost terms in (10)) lead to error accumulation. If the network topology remains
fixed (deterministic topology), the update in eqn. (10) represents a sequence of iterations that, as observed above,
conditioned on the initial state, which then determines the input, are deterministic. If we choose the weights $a(i)$
to decrease to zero very quickly, then (10) may terminate before reaching the consensus set. On the other hand, if
the weights $a(i)$ decay slowly, the quantization errors may accumulate, thus making the states unbounded.

In either case, the naive approach to consensus with quantized data fails to lead to a reasonable solution. This
failure is due to the fact that the error terms are not stochastic. To overcome these problems, we introduce noise
(dither) in a controlled way to randomize the sensor states prior to quantizing the perturbed stochastic state.
Under appropriate conditions, the resulting quantization errors possess nice statistical properties, leading to the
quantized states reaching consensus (in an appropriate sense to be defined below). Dither places consensus with
quantized data in the framework of distributed consensus with noisy communication links; when the range of the 
quantizer is unbounded, we apply stochastic approximation to study the limiting behavior of QC, as we did in [19] 
to study consensus with (analog) noise and random topology. Note that if instead of adding dither, we assumed that 
the quantization errors are independent, uniformly distributed random variables, we would not need to add dither, 
and our analysis would still apply. 

Schuchman conditions. The dither added to randomize the quantization effects satisfies a special condition,
namely, as in subtractively dithered systems, see [38], [39]. Let $\{y(i)\}_{i \ge 0}$ and $\{v(i)\}_{i \ge 0}$ be arbitrary sequences
of random variables, and $q(\cdot)$ be the quantization function (4). When dither is added before quantization, the
quantization error sequence, $\{\varepsilon(i)\}_{i \ge 0}$, is

\[ \varepsilon(i) = q(y(i) + v(i)) - (y(i) + v(i)) \tag{11} \]

If the dither sequence, $\{v(i)\}_{i \ge 0}$, satisfies the Schuchman conditions, [40], then the quantization error sequence,
$\{\varepsilon(i)\}_{i \ge 0}$, in (11) is i.i.d. uniformly distributed on $[-\Delta/2, \Delta/2)$ and independent of the input sequence $\{y(i)\}_{i \ge 0}$
(see [41], [42], [38]). A sufficient condition for $\{v(i)\}_{i \ge 0}$ to satisfy the Schuchman conditions is for it to be a sequence
of i.i.d. random variables uniformly distributed on $[-\Delta/2, \Delta/2)$ and independent of the input sequence $\{y(i)\}_{i \ge 0}$.
In the sequel, the dither $\{v(i)\}_{i \ge 0}$ satisfies the Schuchman conditions. Hence, the quantization error sequence,
$\{\varepsilon(i)\}_{i \ge 0}$, is i.i.d. uniformly distributed on $[-\Delta/2, \Delta/2)$ and independent of the input sequence $\{y(i)\}_{i \ge 0}$.
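
The effect of uniform dither on the quantization error in (11) can be checked empirically. The sketch below is a small Monte Carlo illustration under stated assumptions: a single fixed input `y`, i.i.d. dither uniform on [-delta/2, delta/2), and hypothetical helper names. For a uniform error on an interval of length delta, the sample mean should be close to 0 and the sample variance close to delta squared over 12.

```python
import math
import random
from typing import List


def quantize(y: float, delta: float) -> float:
    """Uniform quantizer with step delta (unbounded range)."""
    return math.floor(y / delta + 0.5) * delta


def dithered_errors(y: float, delta: float, n: int, rng: random.Random) -> List[float]:
    """Samples of the error q(y + v) - (y + v), with v uniform dither."""
    errs = []
    for _ in range(n):
        v = rng.uniform(-delta / 2, delta / 2)
        errs.append(quantize(y + v, delta) - (y + v))
    return errs


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def variance(xs: List[float]) -> float:
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


delta = 0.5
errs = dithered_errors(y=0.3, delta=delta, n=100_000, rng=random.Random(1))
print("sample mean      :", mean(errs))
print("sample variance  :", variance(errs))
print("delta^2 / 12     :", delta ** 2 / 12)
```

Without dither the same input would always produce the same error; with dither the error behaves like zero-mean noise, which is what lets stochastic approximation arguments be applied.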
C. Dithered Quantized Consensus With Random Link Failures: Problem Statement

We now return to the problem formulation of consensus with quantized data with dither added. Introducing the
sequence, $\{v_{nl}(i)\}_{i \ge 0,\ 1 \le n,l \le N}$, of i.i.d. random variables, uniformly distributed on $[-\Delta/2, \Delta/2)$, the state update
equation for quantized consensus is:

\[ x_n(i+1) = (1 - a(i) d_n(i)) x_n(i) + a(i) \sum_{l \in \Omega_n(i)} q\left[ x_l(i) + v_{nl}(i) \right], \quad 1 \le n \le N \tag{12} \]

This equation shows that, before transmitting its state $x_l(i)$ to the $n$-th sensor, the sensor $l$ adds the dither $v_{nl}(i)$,
then the channel between the sensors $n$ and $l$ quantizes this corrupted state, and, finally, sensor $n$ receives this
quantized output. Using eqn. (11), the state update is

\[ x_n(i+1) = (1 - a(i) d_n(i)) x_n(i) + a(i) \sum_{l \in \Omega_n(i)} \left[ x_l(i) + v_{nl}(i) + \varepsilon_{nl}(i) \right] \tag{13} \]

The random variables $v_{nl}(i)$ are independent of the state $x(j)$, i.e., the states of all sensors at iteration $j$, for $j \le i$.
Hence, the collection $\{\varepsilon_{nl}(i)\}$ consists of i.i.d. random variables uniformly distributed on $[-\Delta/2, \Delta/2)$, and the
random variable $\varepsilon_{nl}(i)$ is also independent of the state $x(j)$, $j \le i$.
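
A small simulation can make the update (12) concrete. The sketch below is a hedged illustration, not the paper's experimental setup: it assumes a 4-node ring as the set of realizable links, links that fail independently with probability 1 - `p_alive` at each iteration (the erasure model, a special case of the link model considered here), and the weight sequence a(i) = 1/(i + 2). All function and parameter names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def quantize(y: float, delta: float) -> float:
    """Uniform quantizer with step delta (unbounded range)."""
    return math.floor(y / delta + 0.5) * delta


def qc_iteration(x: List[float], edges: List[Tuple[int, int]], a: float,
                 delta: float, p_alive: float, rng: random.Random) -> List[float]:
    """One dithered quantized update on the links that are up at this iteration."""
    n = len(x)
    neighbors: List[List[int]] = [[] for _ in range(n)]
    for (u, v) in edges:
        if rng.random() < p_alive:       # link (u, v) is up in both directions
            neighbors[u].append(v)
            neighbors[v].append(u)
    x_next = []
    for node in range(n):
        received = 0.0
        for l in neighbors[node]:
            dither = rng.uniform(-delta / 2, delta / 2)   # fresh dither per directed link
            received += quantize(x[l] + dither, delta)
        d = len(neighbors[node])                          # degree at this iteration
        x_next.append((1.0 - a * d) * x[node] + a * received)
    return x_next


def run_qc(x0: List[float], edges: List[Tuple[int, int]], delta: float,
           p_alive: float, iters: int, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    x = list(x0)
    for i in range(iters):
        x = qc_iteration(x, edges, 1.0 / (i + 2), delta, p_alive, rng)
    return x


ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
x0 = [1.0, 5.0, 3.0, 7.0]
x_final = run_qc(x0, ring_edges, delta=0.1, p_alive=0.7, iters=5000)
print("initial average:", sum(x0) / len(x0))
print("final states   :", x_final)
```

In such a run the states agree closely with one another, while their common value is a random variable near, but in general not equal to, the initial average.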

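We rewrite (13) in vector form. Define the random vectors, $\Upsilon(i)$ and $\Psi(i) \in \mathbb{R}^{N \times 1}$, with components

\[ \Upsilon_n(i) = -\sum_{l \in \Omega_n(i)} v_{nl}(i) \tag{14} \]

\[ \Psi_n(i) = -\sum_{l \in \Omega_n(i)} \varepsilon_{nl}(i) \tag{15} \]

The $N$ state update equations in (13) become in vector form

\[ x(i+1) = x(i) - a(i) \left[ L(i) x(i) + \Upsilon(i) + \Psi(i) \right] \tag{16} \]

where $\Upsilon(i)$ and $\Psi(i)$ are zero mean vectors, independent of the state $x(i)$, and have i.i.d. components. Also, if $|\mathcal{M}|$
is the number of realizable network links, eqns. (14) and (15) lead to

\[ E\left[ \|\Upsilon(i)\|^2 \right] = E\left[ \|\Psi(i)\|^2 \right] \le \frac{|\mathcal{M}| \Delta^2}{6}, \quad i \ge 0 \tag{17} \]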

Random Link Failures: We now state the assumption about the link failure model to be adopted throughout
the paper. The graph Laplacians are

\[ L(i) = \overline{L} + \widetilde{L}(i), \quad \forall i \ge 0 \tag{18} \]

where $\{L(i)\}_{i \ge 0}$ is a sequence of i.i.d. Laplacian matrices with mean $\overline{L} = E[L(i)]$, such that $\lambda_2(\overline{L}) > 0$ (we just
require the network to be connected on the average). We do not make any distributional assumptions on the link
failure model. During the same iteration, the link failures can be spatially dependent, i.e., correlated across different
edges of the network. This model subsumes the erasure network model, where the link failures are independent
both over space and time. Wireless sensor networks motivate this model since interference among the sensors'
communications correlates the link failures over space, while over time, it is still reasonable to assume that the
channels are memoryless or independent. We also note that the above assumption $\lambda_2(\overline{L}) > 0$ does not require the
individual random instantiations of $L(i)$ to be connected; in fact, it is possible for all the instantiations to be
disconnected. This enables us to capture a broad class of asynchronous communication models; for example, the
random asynchronous gossip protocol analyzed in [33] satisfies $\lambda_2(\overline{L}) > 0$ and hence falls under this framework.
More generally, in the asynchronous setup, if the sensor nodes are equipped with independent clocks whose ticks
follow a regular random point process (the ticking instants do not have an accumulation point, which is true for
all renewal processes, in particular, the Poisson clock in [33]), and at each tick a random network is realized with
$\lambda_2(\overline{L}) > 0$, independent of the networks realized in previous ticks (this is the case with the link formation
process assumed in [33]), our algorithm applies.¹

We denote the number of network edges at time $i$ as $M(i)$, where the edge set $E(i)$ is a random subset of the set of all
possible edges $\mathcal{E}$, with $|\mathcal{E}| = N(N - 1)/2$. Let $\mathcal{M}$ denote the set of realizable edges. We then have the inclusion

\[ E(i) \subset \mathcal{M} \subset \mathcal{E}, \quad \forall i \tag{19} \]

It is important to note that the value of $M(i)$ depends on the link usage protocol. For example, in the asynchronous
gossip protocol considered in [33], at each iteration only one link is active, and hence $M(i) = 1$.
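
The remark that every instantiation may be disconnected while the network is connected on the average can be illustrated with a toy example. The sketch below assumes a hypothetical gossip-like protocol on a 4-node ring in which exactly one link, chosen uniformly, is active per iteration; the helper names are illustrative. It relies on the standard fact that a Laplacian with nonnegative edge weights has a positive second eigenvalue exactly when the underlying weighted graph is connected, so connectivity is checked by a graph search instead of an eigenvalue computation.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def laplacian(n: int, edges: Sequence[Tuple[int, int]]) -> Matrix:
    """Graph Laplacian D - A of an undirected graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for (u, v) in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def mean_laplacian(instances: Sequence[Matrix], probs: Sequence[float]) -> Matrix:
    """Expected Laplacian when instance k occurs with probability probs[k]."""
    n = len(instances[0])
    return [[sum(p * L[r][c] for L, p in zip(instances, probs)) for c in range(n)]
            for r in range(n)]


def is_connected(L: Matrix) -> bool:
    """True if the graph with an edge wherever L has a nonzero off-diagonal entry is connected."""
    n = len(L)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in range(n):
            if v not in seen and abs(L[u][v]) > 1e-12:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
instances = [laplacian(4, [e]) for e in ring_edges]      # one active link each
L_bar = mean_laplacian(instances, [0.25] * 4)
print("instances connected :", [is_connected(L) for L in instances])
print("mean graph connected:", is_connected(L_bar))
```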

Independence Assumptions: We assume that the Laplacian sequence $\{L(i)\}_{i \ge 0}$ is independent of the dither
sequence $\{\varepsilon_{nl}(i)\}$.

Persistence condition: To obtain convergence, we assume that the gains $a(i)$ satisfy the following:

\[ a(i) > 0, \quad \sum_{i \ge 0} a(i) = \infty, \quad \sum_{i \ge 0} a^2(i) < \infty \tag{20} \]

Condition (20) assures that the gains decay to zero, but not too fast. It is standard in stochastic adaptive signal
processing and control; it is also used in consensus with noisy communications in [34], [19].
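
A familiar sequence satisfying the persistence condition (20) is the harmonic gain a(i) = 1/(i + 1); this particular choice is an illustrative assumption, not one prescribed by the text. The short sketch below prints partial sums: the sum of the gains keeps growing (roughly like the logarithm of the number of terms), while the sum of the squared gains stays below pi squared over 6.

```python
import math
from typing import Tuple


def gain(i: int) -> float:
    """Illustrative gain sequence a(i) = 1 / (i + 1)."""
    return 1.0 / (i + 1)


def partial_sums(n: int) -> Tuple[float, float]:
    """Partial sums of a(i) and a(i)^2 over i = 0, ..., n - 1."""
    s1 = sum(gain(i) for i in range(n))
    s2 = sum(gain(i) ** 2 for i in range(n))
    return s1, s2


for n in (10, 1000, 100_000):
    s1, s2 = partial_sums(n)
    print(f"n={n:>6}  sum a(i)={s1:.4f}  sum a(i)^2={s2:.6f}  (pi^2/6={math.pi ** 2 / 6:.6f})")
```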

Markov property. Denote the natural filtration of the process $X = \{x(i)\}_{i \ge 0}$ by $\{\mathcal{F}_i^X\}_{i \ge 0}$. Because the dither
random variables $v_{nl}(i)$, $1 \le n, l \le N$, are independent of $\mathcal{F}_i^X$ at any time $i \ge 0$, and, correspondingly, the noises
$\Upsilon(i)$ and $\Psi(i)$ are independent of $x(i)$, the process $X$ is Markov.

III. Consensus With Quantized Data: Unbounded Quantized States 

We consider that the dynamic range of the initial sensor data, whose average we wish to compute, is not known.
To avoid quantizer saturation, the quantizer output takes values in the countable alphabet (7), and so the channel
quantizer has unrestricted dynamic range. This is the quantized consensus (QC) with unbounded range algorithm.
Section IV studies quantization with bounded range, i.e., the quantized consensus finite-bit (QCF) algorithm
where the channel quantizers take only a finite number of output values (finite-bit quantizers).

We comment briefly on the organization of the remainder of this section. Subsection III-A proves the a.s.
convergence of the QC algorithm. We characterize the performance of the QC algorithm and derive expressions
for the mean-squared error in Subsection III-B. The tradeoff between m.s.e. and convergence rate is studied in
Subsection III-C. Finally, we present generalizations of the approach in Subsection III-D.

¹ In case the network is static, i.e., the connectivity graph is time-invariant, all the results in the paper apply with $L(i) \equiv \overline{L}$, $\forall i$.

A. QC Algorithm: Convergence

We start with the definition of the consensus subspace $\mathcal{C}$ given as

\[ \mathcal{C} = \left\{ x \in \mathbb{R}^{N \times 1} \mid x = a\mathbf{1},\ a \in \mathbb{R} \right\} \tag{21} \]

We note that any vector $x \in \mathbb{R}^{N \times 1}$ can be uniquely decomposed as

\[ x = x_{\mathcal{C}} + x_{\mathcal{C}^\perp} \tag{22} \]

and

\[ \|x\|^2 = \|x_{\mathcal{C}}\|^2 + \|x_{\mathcal{C}^\perp}\|^2 \tag{23} \]

where $x_{\mathcal{C}} \in \mathcal{C}$ and $x_{\mathcal{C}^\perp}$ belongs to $\mathcal{C}^\perp$, the orthogonal subspace of $\mathcal{C}$. We show that (16), under the model in
Subsection II-C, converges a.s. to a finite point in $\mathcal{C}$.
Define the component-wise average as

\[ x_{\mathrm{avg}}(i) = \frac{1}{N} \mathbf{1}^T x(i) \tag{24} \]

We prove the a.s. convergence of the QC algorithm in two stages. Theorem 2 proves that the state vector sequence
$\{x(i)\}_{i \ge 0}$ converges a.s. to the consensus subspace $\mathcal{C}$. Theorem 3 then completes the proof by showing that the
sequence of component-wise averages, $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$, converges a.s. to a finite random variable $\theta$. The proof of
Theorem 3 needs a basic result on convergence of Markov processes and follows the same theme as in [19].
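
The decomposition in (22) and the identity (23) are easy to verify numerically: the component along the consensus subspace is the average replicated in every entry, and the orthogonal component is what remains. The following sketch uses hypothetical helper names and an arbitrary example vector.

```python
from typing import List, Tuple


def decompose(x: List[float]) -> Tuple[List[float], List[float]]:
    """Split x into its projection on span{1} and the orthogonal remainder."""
    avg = sum(x) / len(x)
    x_c = [avg] * len(x)
    x_perp = [xn - avg for xn in x]
    return x_c, x_perp


def sq_norm(x: List[float]) -> float:
    return sum(v * v for v in x)


x = [1.0, 2.0, 3.0, 6.0]
x_c, x_perp = decompose(x)
print("x_C      =", x_c)
print("x_C_perp =", x_perp)
print("||x||^2  =", sq_norm(x), " vs ", sq_norm(x_c) + sq_norm(x_perp))
```

The norm of the orthogonal component is the distance from the state to the consensus subspace, so showing that it vanishes is the content of convergence to consensus.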

Stochastic approximation: Convergence of Markov processes. We state a slightly modified form, suitable to
our needs, of a result from [43]. We start by introducing notation, following [43], see also [19].
Let $X = \{x(i)\}_{i \ge 0}$ be Markov in $\mathbb{R}^{N \times 1}$. The generating operator $\mathcal{L}$ is

\[ \mathcal{L}V(i, x) = E\left[ V(i+1, x(i+1)) \mid x(i) = x \right] - V(i, x) \quad \text{a.s.} \tag{25} \]

for functions $V(i,x)$, $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, provided the conditional expectation exists. We say that $V(i,x) \in D_{\mathcal{L}}$
in a domain $A$, if $\mathcal{L}V(i,x)$ is finite for all $(i,x) \in A$.

Let the Euclidean metric be $\rho(\cdot, \cdot)$. Define the $\epsilon$-neighborhood of $B \subset \mathbb{R}^{N \times 1}$ and its complementary set

\[ U_\epsilon(B) = \left\{ x \;\middle|\; \inf_{y \in B} \rho(x, y) < \epsilon \right\} \tag{26} \]

\[ V_\epsilon(B) = \mathbb{R}^{N \times 1} \setminus U_\epsilon(B) \tag{27} \]

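Theorem 1 (Convergence of Markov Processes) Let: $X$ be a Markov process with generating operator $\mathcal{L}$; $V(i, x) \in
D_{\mathcal{L}}$ a non-negative function in the domain $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, and $B \subset \mathbb{R}^{N \times 1}$. Assume:

1) Potential function:

\[ \inf_{i \ge 0,\ x \in V_\epsilon(B)} V(i,x) > 0, \quad \forall \epsilon > 0 \tag{28} \]

\[ V(i,x) \equiv 0, \quad x \in B \tag{29} \]

\[ \lim_{x \to B} \sup_{i \ge 0} V(i,x) = 0 \tag{30} \]

2) Generating operator:

\[ \mathcal{L}V(i,x) \le g(i)\left(1 + V(i,x)\right) - a(i)\varphi(i,x) \tag{31} \]

where $\varphi(i,x)$, $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, is a non-negative function such that

\[ \inf_{i \ge 0,\ x \in V_\epsilon(B)} \varphi(i,x) > 0, \quad \forall \epsilon > 0 \tag{32} \]

\[ a(i) > 0, \quad \sum_{i \ge 0} a(i) = \infty \tag{33} \]

\[ g(i) > 0, \quad \sum_{i \ge 0} g(i) < \infty \tag{34} \]

Then, the Markov process $X = \{x(i)\}_{i \ge 0}$ with arbitrary initial distribution converges a.s. to $B$ as $i \to \infty$:

\[ P\left( \lim_{i \to \infty} \rho(x(i), B) = 0 \right) = 1 \tag{35} \]

Proof: For proof, see [43], [19]. ■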

Theorem 2 (a.s. convergence to consensus subspace) Consider the quantized distributed averaging algorithm given
in eqn. (16). Then, for arbitrary initial condition, $\mathbf{x}(0)$, we have

$$\mathbb{P}\left[ \lim_{i \to \infty} \rho(\mathbf{x}(i), \mathcal{C}) = 0 \right] = 1 \tag{36}$$



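Proof: The proof uses arguments similar to those of Theorem 3 in [19], so we provide only the main steps
and those details that are required for later development of the paper.

The key idea is to show that the quantized iterations satisfy the assumptions of Theorem 1. Define the potential
function, $V(i,\mathbf{x})$, for the Markov process $\mathbf{X}$ as

$$V(i,\mathbf{x}) = \mathbf{x}^T \bar{L} \mathbf{x} \tag{37}$$

Then, using the properties of $\bar{L}$ and the continuity of $V(i,\mathbf{x})$,

$$V(i,\mathbf{x}) = 0, \ \mathbf{x} \in \mathcal{C} \quad \text{and} \quad \lim_{\mathbf{x} \to \mathcal{C}} \sup_{i \ge 0} V(i,\mathbf{x}) = 0 \tag{38}$$

For $\mathbf{x} \in \mathbb{R}^{N \times 1}$, we clearly have $\rho(\mathbf{x},\mathcal{C}) = \|\mathbf{x}_{\mathcal{C}^\perp}\|$. Using the fact that $\mathbf{x}^T \bar{L} \mathbf{x} \ge \lambda_2(\bar{L}) \|\mathbf{x}_{\mathcal{C}^\perp}\|^2$, it then follows that

$$\inf_{i \ge 0,\ \mathbf{x} \in V_\epsilon(\mathcal{C})} V(i,\mathbf{x}) \ge \inf_{i \ge 0,\ \mathbf{x} \in V_\epsilon(\mathcal{C})} \lambda_2(\bar{L}) \|\mathbf{x}_{\mathcal{C}^\perp}\|^2 \ge \lambda_2(\bar{L})\,\epsilon^2 > 0 \tag{39}$$

since $\lambda_2(\bar{L}) > 0$. This shows, together with (38), that $V(i,\mathbf{x})$ satisfies (28)-(30).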

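Now consider $\mathcal{L}V(i,\mathbf{x})$. Using the fact that $\widetilde{L}(i)\mathbf{x} = \widetilde{L}(i)\mathbf{x}_{\mathcal{C}^\perp}$ and the independence assumptions, we have

$$\begin{aligned}
\mathcal{L}V(i,\mathbf{x}) ={}& \mathbb{E}\Big[ \big(\mathbf{x}(i) - \alpha(i)\bar{L}\mathbf{x}(i) - \alpha(i)\widetilde{L}(i)\mathbf{x}(i) - \alpha(i)\Upsilon(i) - \alpha(i)\Psi(i)\big)^T \bar{L}\, \big(\mathbf{x}(i) - \alpha(i)\bar{L}\mathbf{x}(i) \\
& \quad - \alpha(i)\widetilde{L}(i)\mathbf{x}(i) - \alpha(i)\Upsilon(i) - \alpha(i)\Psi(i)\big) \,\Big|\, \mathbf{x}(i) = \mathbf{x} \Big] - \mathbf{x}^T \bar{L} \mathbf{x} \\
\le{}& -2\alpha(i)\,\mathbf{x}^T \bar{L}^2 \mathbf{x} + \alpha^2(i)\lambda_N^3(\bar{L}) \|\mathbf{x}_{\mathcal{C}^\perp}\|^2 + \alpha^2(i)\lambda_N(\bar{L})\, \mathbb{E}\big[\lambda_{\max}^2(\widetilde{L}(i))\big] \|\mathbf{x}_{\mathcal{C}^\perp}\|^2 \\
& + 2\alpha^2(i)\lambda_N(\bar{L}) \big(\mathbb{E}\big[\|\Upsilon(i)\|^2\big]\big)^{1/2} \big(\mathbb{E}\big[\|\Psi(i)\|^2\big]\big)^{1/2} + \alpha^2(i)\lambda_N(\bar{L})\,\mathbb{E}\big[\|\Upsilon(i)\|^2\big] \\
& + \alpha^2(i)\lambda_N(\bar{L})\,\mathbb{E}\big[\|\Psi(i)\|^2\big]
\end{aligned} \tag{40}$$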
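Since $\mathbf{x}^T \bar{L} \mathbf{x} \ge \lambda_2(\bar{L}) \|\mathbf{x}_{\mathcal{C}^\perp}\|^2$, the eigenvalues of $\widetilde{L}(i)$ are not greater than $2N$ in magnitude, and from (17), we get

$$\begin{aligned}
\mathcal{L}V(i,\mathbf{x}) &\le -2\alpha(i)\,\mathbf{x}^T \bar{L}^2 \mathbf{x} + \left( \frac{\alpha^2(i)\lambda_N^3(\bar{L})}{\lambda_2(\bar{L})} + \frac{4\alpha^2(i)N^2\lambda_N(\bar{L})}{\lambda_2(\bar{L})} \right) \mathbf{x}^T \bar{L} \mathbf{x} + \frac{2\alpha^2(i)|\mathcal{M}|\Delta^2\lambda_N(\bar{L})}{3} \\
&\le -\alpha(i)\varphi(i,\mathbf{x}) + g(i)\left[1 + V(i,\mathbf{x})\right]
\end{aligned} \tag{41}$$

where

$$\varphi(i,\mathbf{x}) = 2\,\mathbf{x}^T \bar{L}^2 \mathbf{x}, \qquad g(i) = \alpha^2(i) \max\left( \frac{\lambda_N^3(\bar{L})}{\lambda_2(\bar{L})} + \frac{4N^2\lambda_N(\bar{L})}{\lambda_2(\bar{L})},\ \frac{2|\mathcal{M}|\Delta^2\lambda_N(\bar{L})}{3} \right) \tag{42}$$

Clearly, $\mathcal{L}V(i,\mathbf{x})$ and $\varphi(i,\mathbf{x})$, $g(i)$ satisfy the remaining assumptions (31)-(34) of Theorem 1; hence,

$$\mathbb{P}\left[ \lim_{i \to \infty} \rho(\mathbf{x}(i), \mathcal{C}) = 0 \right] = 1 \tag{43}$$

■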



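The convergence proof for QC is completed in the next theorem.

Theorem 3 (Consensus to finite random variable) Consider (16), with arbitrary initial condition $\mathbf{x}(0) \in \mathbb{R}^{N \times 1}$ and
the state sequence $\{\mathbf{x}(i)\}_{i \ge 0}$. Then, there exists a finite random variable $\theta$ such that

$$\mathbb{P}\left[ \lim_{i \to \infty} \mathbf{x}(i) = \theta\mathbf{1} \right] = 1 \tag{44}$$

Proof: Define the filtration $\{\mathcal{F}_i\}_{i \ge 0}$ as

$$\mathcal{F}_i = \sigma\left\{ \mathbf{x}(0),\ \{L(j)\}_{0 \le j < i},\ \{\Upsilon(j)\}_{0 \le j < i},\ \{\Psi(j)\}_{0 \le j < i} \right\} \tag{45}$$

We will now show that the sequence $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ is an $\mathcal{L}_2$-bounded martingale w.r.t. $\{\mathcal{F}_i\}_{i \ge 0}$. In fact,

$$x_{\mathrm{avg}}(i+1) = x_{\mathrm{avg}}(i) - \alpha(i)\bar{\Upsilon}(i) - \alpha(i)\bar{\Psi}(i) \tag{46}$$

where $\bar{\Upsilon}(i)$ and $\bar{\Psi}(i)$ are the component-wise averages given by

$$\bar{\Upsilon}(i) = \frac{1}{N}\mathbf{1}^T \Upsilon(i), \qquad \bar{\Psi}(i) = \frac{1}{N}\mathbf{1}^T \Psi(i) \tag{47}$$

Then,

$$\begin{aligned}
\mathbb{E}\left[ x_{\mathrm{avg}}(i+1) \mid \mathcal{F}_i \right] &= x_{\mathrm{avg}}(i) - \alpha(i)\,\mathbb{E}\left[\bar{\Upsilon}(i) \mid \mathcal{F}_i\right] - \alpha(i)\,\mathbb{E}\left[\bar{\Psi}(i) \mid \mathcal{F}_i\right] \\
&= x_{\mathrm{avg}}(i) - \alpha(i)\,\mathbb{E}\left[\bar{\Upsilon}(i)\right] - \alpha(i)\,\mathbb{E}\left[\bar{\Psi}(i)\right] \\
&= x_{\mathrm{avg}}(i)
\end{aligned} \tag{48}$$

where the last step follows from the fact that $\Upsilon(i)$ is independent of $\mathcal{F}_i$ and

$$\mathbb{E}\left[\bar{\Psi}(i) \mid \mathcal{F}_i\right] = \mathbb{E}\left[\bar{\Psi}(i) \mid \mathbf{x}(i)\right] = 0 \tag{49}$$

because $\Psi(i)$ is independent of $\mathbf{x}(i)$, as argued in Section II-B.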
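Thus, the sequence $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ is a martingale. For proving $\mathcal{L}_2$ boundedness, note

$$\begin{aligned}
\mathbb{E}\left[x_{\mathrm{avg}}^2(i+1)\right] &= \mathbb{E}\left[ x_{\mathrm{avg}}(i) - \alpha(i)\bar{\Upsilon}(i) - \alpha(i)\bar{\Psi}(i) \right]^2 \\
&= \mathbb{E}\left[x_{\mathrm{avg}}^2(i)\right] + \alpha^2(i)\,\mathbb{E}\left[\bar{\Upsilon}^2(i)\right] + \alpha^2(i)\,\mathbb{E}\left[\bar{\Psi}^2(i)\right] + 2\alpha^2(i)\,\mathbb{E}\left[\bar{\Upsilon}(i)\bar{\Psi}(i)\right] \\
&\le \mathbb{E}\left[x_{\mathrm{avg}}^2(i)\right] + \alpha^2(i)\,\mathbb{E}\left[\bar{\Upsilon}^2(i)\right] + \alpha^2(i)\,\mathbb{E}\left[\bar{\Psi}^2(i)\right] + 2\alpha^2(i) \left(\mathbb{E}\left[\bar{\Upsilon}^2(i)\right]\right)^{1/2} \left(\mathbb{E}\left[\bar{\Psi}^2(i)\right]\right)^{1/2}
\end{aligned} \tag{50}$$

Again, it can be shown by using the independence properties and (17) that

$$\mathbb{E}\left[\bar{\Upsilon}^2(i)\right] = \mathbb{E}\left[\bar{\Psi}^2(i)\right] \le \frac{|\mathcal{M}|\Delta^2}{6N^2} \tag{51}$$

where $|\mathcal{M}|$ is the number of realizable edges in the network (eqn. (19)). It then follows from eqn. (50) that

$$\mathbb{E}\left[x_{\mathrm{avg}}^2(i+1)\right] \le \mathbb{E}\left[x_{\mathrm{avg}}^2(i)\right] + \frac{2\alpha^2(i)|\mathcal{M}|\Delta^2}{3N^2} \tag{52}$$

Finally, the recursion leads to

$$\mathbb{E}\left[x_{\mathrm{avg}}^2(i)\right] \le x_{\mathrm{avg}}^2(0) + \frac{2|\mathcal{M}|\Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \tag{53}$$

Note that in this equation, $x_{\mathrm{avg}}^2(0)$ is bounded since it is the average of the initial conditions, i.e., at time 0. Thus
$\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ is an $\mathcal{L}_2$-bounded martingale; hence, it converges a.s. and in $\mathcal{L}_2$ to a finite random variable $\theta$ ([44]).
In other words,

$$\mathbb{P}\left[ \lim_{i \to \infty} x_{\mathrm{avg}}(i) = \theta \right] = 1 \tag{54}$$

Again, Theorem 2 implies that as $i \to \infty$ we have $\mathbf{x}(i) \to x_{\mathrm{avg}}(i)\mathbf{1}$ a.s. This and (54) prove the Theorem. ■
We extend Theorems 2 and 3 to derive the mean-squared sense (m.s.s.) consensus of the sensor states to the random variable
$\theta$ under additional assumptions on the weight sequence $\{\alpha(i)\}_{i \ge 0}$.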



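Lemma 4 Let the weight sequence $\{\alpha(i)\}_{i \ge 0}$ be of the form:

$$\alpha(i) = \frac{a}{(i+1)^{\tau}} \tag{55}$$

where $a > 0$ and $.5 < \tau \le 1$. Then the a.s. convergence in Theorem 3 holds in m.s.s. also, i.e.,

$$\lim_{i \to \infty} \mathbb{E}\left[ \left(x_n(i) - \theta\right)^2 \right] = 0, \quad \forall n \tag{56}$$

Proof: The proof is provided in Appendix I. ■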



B. QC Algorithm: Mean-Squared Error 

Theorem 3 shows that the sensors reach consensus asymptotically and in fact converge a.s. to a finite random
variable $\theta$. Viewing $\theta$ as an estimate of the initial average $r$ (see eqn. (1)), we characterize its desirable statistical
properties in the following Lemma.

Lemma 5 Let $\theta$ be as given in Theorem 3 and $r$, the initial average, as given in eqn. (1). Define

$$\zeta = \mathbb{E}\left[\theta - r\right]^2 \tag{57}$$

to be the m.s.e. Then, we have:

1) Unbiasedness: $\mathbb{E}[\theta] = r$

2) M.S.E. Bound: $\zeta \le \frac{2|\mathcal{M}|\Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j)$

Proof: The proof follows from the arguments presented in the proof of Theorem 3 and is omitted. ■
We note that the m.s.e. bound in Lemma 5 is conservative. Recalling the definition of M(i), as the number of 
active links at time i (see eqn. (19)), we have (by revisiting the arguments in the proof of Theorem 3) 

9A 2 

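(Note that the term $\sum_{j \ge 0} \alpha^2(j)\, \mathbb{E}\left[|\mathcal{M}(j)|^2\right]$ is well-defined, as $\mathbb{E}\left[|\mathcal{M}(i)|^2\right] \le |\mathcal{M}|^2$, $\forall i$.) In case we have a fixed
(non-random) topology, $\mathcal{M}(i) = \mathcal{M}$, $\forall i$, and the bound in eqn. (58) reduces to the one in Lemma 5. For the
asynchronous gossip protocol in [33], $|\mathcal{M}(i)| = 1$, $\forall i$, and hence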

$$\zeta \le \frac{2\Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \tag{59}$$

Lemma 5 shows that, for a given $\Delta$, $\zeta$ can be made arbitrarily small by properly scaling the weight sequence,
$\{\alpha(i)\}_{i \ge 0}$. We formalize this. Given an arbitrary weight sequence, $\{\alpha(i)\}_{i \ge 0}$, which satisfies the persistence
condition (20), define the scaled weight sequence, $\{\alpha_s(i)\}_{i \ge 0}$, as

$$\alpha_s(i) = s\,\alpha(i), \quad \forall i \ge 0 \tag{60}$$

where $s > 0$ is a constant scaling factor. Clearly, such a scaled weight sequence satisfies the persistence condition (20), and the m.s.e. $\zeta_s$ obtained by using this scaled weight sequence is bounded by

$$\zeta_s \le \frac{2|\mathcal{M}|\Delta^2 s^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \tag{61}$$

showing that, by proper scaling of the weight sequence, the m.s.e. can be made arbitrarily small.

However, reducing the m.s.e. by scaling the weights in this way will reduce the convergence rate of the algorithm. 
This tradeoff is considered in the next subsection. 
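The effect of the scaling factor $s$ on the m.s.e. bound can be illustrated numerically. The sketch below assumes that the Lemma 5 bound has the form $\frac{2|\mathcal{M}|\Delta^2}{3N^2}\sum_j \alpha^2(j)$ and replaces the infinite sum by a finite horizon, so the computed value slightly underestimates the true bound. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def mse_bound(num_edges, delta, n_nodes, weights):
    """Assumed Lemma 5 form: (2|M| Delta^2 / (3 N^2)) * sum_j alpha(j)^2, finite sum."""
    return 2.0 * num_edges * delta**2 / (3.0 * n_nodes**2) * sum(a * a for a in weights)

def scaled_weights(s, a=1.0, tau=1.0, horizon=10_000):
    """Scaled weights alpha_s(j) = s * a / (j + 1)^tau, truncated at `horizon` terms."""
    return [s * a / (j + 1) ** tau for j in range(horizon)]

# Hypothetical network: 10 nodes, 30 realizable edges, step size 0.5.
base = mse_bound(num_edges=30, delta=0.5, n_nodes=10, weights=scaled_weights(1.0))
half = mse_bound(num_edges=30, delta=0.5, n_nodes=10, weights=scaled_weights(0.5))
ratio = half / base   # scaling the weights by s scales the bound by s^2
```

Halving the weights reduces the bound by a factor of four, while the partial sums of the weights (which govern the convergence rate) are only halved.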



C. QC Algorithm: Convergence Rate 

A detailed pathwise convergence rate analysis can be carried out for the QC algorithm using strong approximations 
like laws of iterated logarithms etc., as is the case with a large class of stochastic approximation algorithms. More 
generally, we can study formally some moderate deviations asymptotics ([45], [46]) or take recourse to concentration 
inequalities ([47]) to characterize convergence rate. Due to space limitations we do not pursue such analysis in this 
paper; rather, we present convergence rate analysis for the state sequence $\{\mathbf{x}(i)\}_{i \ge 0}$ in the m.s.s. and for the
mean state vector sequence. We start by studying the convergence of the mean state vectors, which is simple, yet
illustrates an interesting trade-off between the achievable convergence rate and the mean-squared error $\zeta$ through
design of the weight sequence $\{\alpha(i)\}_{i \ge 0}$.

From the asymptotic unbiasedness of $\theta$ we have

$$\lim_{i \to \infty} \mathbb{E}\left[\mathbf{x}(i)\right] = r\mathbf{1} \tag{62}$$

Our objective is to determine the rate at which the sequence $\{\mathbb{E}[\mathbf{x}(i)]\}_{i \ge 0}$ converges to $r\mathbf{1}$.

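Lemma 6 Without loss of generality, make the assumption

$$\alpha(i) \le \frac{2}{\lambda_2(\bar{L}) + \lambda_N(\bar{L})}, \quad \forall i \tag{63}$$

(We note that this holds eventually, as the $\alpha(i)$ decrease to zero.) Then,

$$\left\| \mathbb{E}\left[\mathbf{x}(i)\right] - r\mathbf{1} \right\| \le \left( e^{-\lambda_2(\bar{L}) \sum_{0 \le j \le i-1} \alpha(j)} \right) \left\| \mathbb{E}\left[\mathbf{x}(0)\right] - r\mathbf{1} \right\| \tag{64}$$

Proof: We note that the mean state propagates as

$$\mathbb{E}\left[\mathbf{x}(i+1)\right] = \left(I - \alpha(i)\bar{L}\right) \mathbb{E}\left[\mathbf{x}(i)\right] \tag{65}$$

The proof then follows from [19] and is omitted. ■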
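The mean-state recursion and the exponential bound of Lemma 6 can be checked on a small example. The sketch below assumes NumPy, uses the Laplacian of a 6-node path graph as a stand-in for the mean Laplacian, and assumes the weight ceiling in (63) is $2/(\lambda_2 + \lambda_N)$. The graph, the initial state, and the helper names are illustrative choices, not taken from the paper.

```python
import numpy as np

def path_laplacian(n):
    """Laplacian of a path graph on n nodes (stand-in for the mean Laplacian)."""
    A = np.diag(np.ones(n - 1), 1)
    A = A + A.T
    return np.diag(A.sum(axis=1)) - A

def mean_state_errors(L_bar, x0, weights):
    """Iterate m(i+1) = (I - alpha(i) L_bar) m(i) and record ||m(i) - r 1||."""
    r = x0.mean()                      # the initial average is preserved by the recursion
    m = x0.copy()
    errs = [np.linalg.norm(m - r)]
    for a in weights:
        m = m - a * (L_bar @ m)
        errs.append(np.linalg.norm(m - r))
    return np.array(errs)

L_bar = path_laplacian(6)
eig = np.linalg.eigvalsh(L_bar)        # ascending order; eig[0] is (numerically) zero
lam2, lamN = eig[1], eig[-1]
a_max = 2.0 / (lam2 + lamN)            # assumed ceiling on alpha(i) from (63)
weights = np.array([a_max / (i + 1) for i in range(200)])
x0 = np.array([5.0, -2.0, 0.5, 3.0, -4.0, 1.0])

errs = mean_state_errors(L_bar, x0, weights)
# Right-hand side of (64): initial error times exp(-lam2 * partial sums of the weights).
bound = errs[0] * np.exp(-lam2 * np.concatenate(([0.0], np.cumsum(weights))))
```

Comparing `errs` with `bound` entry by entry shows how the decay is tied to the partial sums of the weight sequence: a sequence scaled down by $s$ has partial sums scaled by $s$, and the exponent shrinks accordingly.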
It follows from Lemma 6 that the rate at which the sequence $\{\mathbb{E}[\mathbf{x}(i)]\}_{i \ge 0}$ converges to $r\mathbf{1}$ is closely related to
the rate at which the weight sequence, $\alpha(i)$, sums to infinity. On the other hand, to achieve a small bound $\zeta$ on the
m.s.e. (see Lemma 5 in Subsection III-B), we need to make the weights small, which reduces the convergence rate
of the algorithm. The parameter $s$ introduced in eqn. (60) can then be viewed as a scalar control parameter, which
can be used to trade off precision (m.s.e.) against convergence rate. More specifically, if a cost structure is
imposed on the consensus problem, where the objective is a function of the m.s.e. and the convergence rate, one
may obtain the optimal scaling $s$ minimizing the cost from the Pareto-optimal curve generated by varying $s$. This
is significant, because the algorithm allows one to trade off m.s.e. vs. convergence rate, and in particular, if the
application requires precision (low m.s.e.), one can make the m.s.e. arbitrarily small irrespective of the quantization
step-size $\Delta$. It is important to note in this context that, though the algorithms in [22], [20] lead to finite m.s.e., the
resulting m.s.e. is proportional to $\Delta^2$, which may become large if the step-size $\Delta$ is chosen to be large.

Note that this tradeoff is established between the convergence rate of the mean state vectors and the m.s.e. of
the limiting consensus variable $\theta$. But, in general, even for more appropriate measures of the convergence rate, we
expect that, intuitively, the same tradeoff will be exhibited, in the sense that the rate of convergence will be closely
related to the rate at which the weight sequence, $\alpha(i)$, sums to infinity. We end this subsection by studying the
m.s.s. convergence rate of the state sequence $\{\mathbf{x}(i)\}_{i \ge 0}$, which is shown to exhibit a similar trade-off.

Lemma 7 Let the weight sequence $\{\alpha(i)\}_{i \ge 0}$ be of the form:

$$\alpha(i) = \frac{a}{(i+1)^{\tau}} \tag{66}$$

where $a > 0$ and $.5 < \tau \le 1$. Then the m.s.s. error evolves as follows: for every $0 < \varepsilon < \frac{2\lambda_2^2(\bar{L})}{\lambda_N(\bar{L})}$, there exists
$i_\varepsilon \ge 0$, such that, for all $i \ge i_\varepsilon$, we have

1 



E 



|x(i) — rl\ 



< 



+ 



A 2 (£) 
1 



H E;:! £ «(i) 



E 



(67) 



E 



A N (t) 



2|.M|A' 



Proof: The proof is provided in Appendix I. ■
From the above we note that scaling down the sequence $\{\alpha(i)\}_{i \ge 0}$ decreases the polynomial terms on the R.H.S. of
eqn. (67) but increases the exponential terms. Since the effect of the exponentials dominates that of the polynomials,
we see a trade-off between m.s.e. and convergence rate (m.s.s.) similar to the one observed when studying the mean state
vector sequence above.



D. QC Algorithm: Generalizations 

The QC algorithm can be extended to handle more complex situations of imperfect communication. For instance, 
we may incorporate Markovian link failures (as in [19]) and time-varying quantization step-size with the same type 
of analysis. 

Markovian packet dropouts can be an issue in some practical wireless sensor network scenarios, where random 
environmental phenomena like scattering may lead to temporal dependence in the link quality. Another situation 
arises in networks of mobile agents, where physical aspects of the transmission like channel coherence time, channel 
fading effects are related to the mobility of the dynamic network. A general analysis of all such scenarios is beyond 
the scope of the current paper. However, when temporal dependence is manifested through a state dependent 
Laplacian (this occurs in mobile networks, formation control problems in multi-vehicle systems), under fairly 
general conditions, the link quality can be modeled as a temporal Markov process as in [19] (see Assumption 1.2 
in [19].) Due to space limitations of the current paper, we do not present a detailed analysis in this context and 
refer the interested reader to [19], where such temporally Markov link failures were addressed in detail, though in 
the context of unquantized analog transmission. 

The current paper focuses on quantized transmission of data and neglects the effect of additive analog noise. 
Even in such a situation of digital transmission, the message decoding process at the receiver may lead to analog 
noise. Our approach can take into account such generalized distortions and the main results will continue to hold. 
For analysis purposes, temporally independent zero mean analog noise can be incorporated as an additional term
on the R.H.S. of eqn. (16) and subsequently absorbed into the zero mean vectors $\Psi(i)$, $\Upsilon(i)$. Digital transmission
where bits can get flipped due to noise would be more challenging to address.

The case of time-varying quantization may be relevant in many practical communication networks, where, because
of a bit-budget, the quantization may become coarser (the step-size increases) as time progresses. It may also arise
if one considers a rate allocation protocol with vanishing rates as time progresses (see [48]). In that case, the
quantization step-size sequence, $\{\Delta(i)\}_{i \ge 0}$, is time-varying with possibly

$$\limsup_{i \to \infty} \Delta(i) = \infty \tag{68}$$
Also, as suggested in [27], one may consider a rate allocation scheme in which the quantizer becomes finer as
time progresses. In that way, the quantization step-size sequence, $\{\Delta(i)\}_{i \ge 0}$, may be a decreasing sequence.

Generally, in a situation like this, to attain consensus the link weight sequence $\{\alpha(i)\}_{i \ge 0}$ needs to satisfy a
generalized persistence condition of the form

$$\sum_{i \ge 0} \alpha(i) = \infty, \qquad \sum_{i \ge 0} \alpha^2(i)\Delta^2(i) < \infty \tag{69}$$

Note that, when the quantization step-size is bounded, this reduces to the persistence condition assumed earlier. We
state without proof the following result for the time-varying quantization case.

Theorem 8 Consider the QC algorithm with time-varying quantization step-size sequence $\{\Delta(i)\}_{i \ge 0}$ and let the
link weight sequence $\{\alpha(i)\}_{i \ge 0}$ satisfy the generalized persistence condition in eqn. (69). Then the sensors reach
consensus to an a.s. finite random variable. In other words, there exists an a.s. finite random variable $\theta$, such that,

$$\mathbb{P}\left[ \lim_{i \to \infty} x_n(i) = \theta,\ \forall n \right] = 1 \tag{70}$$

Also, if $r$ is the initial average, then

$$\mathbb{E}\left[(\theta - r)^2\right] \le \frac{2|\mathcal{M}|}{3N^2} \sum_{j \ge 0} \alpha^2(j)\Delta^2(j) \tag{71}$$

It is clear that in this case also, we can trade-off m.s.e. with convergence rate by tuning a scalar gain parameter s 
associated with the link weight sequence. 
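As a worked illustration of the generalized persistence condition, consider a polynomially growing step-size. The parametrization below (the constants $\Delta_0$ and $\kappa$) is a hypothetical example and is not taken from the paper; it only shows how fast the quantizer may coarsen for a given weight decay.

```latex
% Hypothetical choice: polynomial weights and polynomially coarsening quantizer
\alpha(i) = \frac{a}{(i+1)^{\tau}}, \qquad \Delta(i) = \Delta_0\,(i+1)^{\kappa}, \qquad a, \Delta_0 > 0,\ \kappa \ge 0 .

% First part of the condition: the weights must sum to infinity
\sum_{i \ge 0} \alpha(i) = \infty \iff \tau \le 1 .

% Second part: the weighted step-sizes must be square summable
\sum_{i \ge 0} \alpha^2(i)\Delta^2(i)
  = a^2 \Delta_0^2 \sum_{i \ge 0} (i+1)^{-2(\tau - \kappa)} < \infty
  \iff \tau - \kappa > \tfrac{1}{2} .

% Both parts hold iff  \kappa + 1/2 < \tau \le 1;  e.g. for \tau = 1 any \kappa < 1/2 is admissible.
```

In this example the step-size may grow without bound, as in the coarsening scenario described above, provided it grows more slowly than $(i+1)^{\tau - 1/2}$.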

IV. Consensus with Quantized Data: Bounded Initial Sensor State 

We consider consensus with quantized data and bounded range quantizers when the initial sensor states are
bounded, and this bound is known a priori. We show that finite bit quantizers (whose outputs take only a finite
number of values) suffice. The algorithm QCF that we consider is a simple modification of the QC algorithm
of Section III. The good performance of the QCF algorithm relies on the fact that, if the initial sensor states
are bounded, the state sequence $\{\mathbf{x}(i)\}_{i \ge 0}$ generated by the QC algorithm remains uniformly bounded with high
probability, as we prove here. In this case, channel quantizers with finite dynamic range perform well with high
probability.

We briefly state the QCF problem in Subsection IV-A. Then, Subsection IV-B shows that with high probability 
the sample paths generated by the QC algorithm are uniformly bounded, when the initial sensor states are bounded. 
Subsection IV-C proves that QCF achieves asymptotic consensus. Finally, Subsections IV-D and IV-E analyze its 
statistical properties, performance, and tradeoffs. 



A. QCF Algorithm: Statement 

The QCF algorithm modifies the QC algorithm by restricting the alphabet of the quantizer to be finite. It assumes
that the initial sensor state $\mathbf{x}(0)$, whose average we wish to compute, is known to be bounded. Of course, even if the
initial state is bounded, the states of QC can become unbounded. The good performance of QCF is a consequence
of the fact that, as our analysis will show, the states $\{\mathbf{x}(i)\}_{i \ge 0}$ generated by the QC algorithm when started with
a bounded initial state $\mathbf{x}(0)$ remain uniformly bounded with high probability.

The following are the assumptions underlying QCF. We let the state sequence for QCF be represented by $\{\tilde{\mathbf{x}}(i)\}_{i \ge 0}$.

1) Bounded initial state. Let $b > 0$. The QCF initial state $\tilde{\mathbf{x}}(0) = \mathbf{x}(0)$ is bounded to the set $\mathcal{B}$ known a priori

$$\mathcal{B} = \left\{ \mathbf{y} \in \mathbb{R}^{N \times 1} \;\middle|\; |y_n| \le b < +\infty \right\} \tag{72}$$

2) Uniform quantizers and finite alphabet. Each inter-sensor communication channel in the network uses a uniform
$\lceil \log_2(2p+1) \rceil$ bit quantizer with step-size $\Delta$, where $p > 0$ is an integer. In other words, the quantizer output
takes only $2p+1$ values, and the quantization alphabet is given by

$$\mathcal{Q} = \left\{ l\Delta \mid l = 0, \pm 1, \cdots, \pm p \right\} \tag{73}$$

Clearly, such a quantizer will not saturate if the input falls in the range $[(-p - 1/2)\Delta, (p + 1/2)\Delta)$; if the
input goes out of that range, the quantizer saturates.

3) Uniform i.i.d. noise. As with QC, the $\{\nu_{nl}(i)\}_{i \ge 0,\ 1 \le n,l \le N}$ are a sequence of i.i.d. random variables uniformly
distributed on $[-\Delta/2, \Delta/2)$.

4) The link failure model is the same as used in QC.

Given this setup, we present the distributed QCF algorithm, assuming that the sensor network is connected. The
state sequence, $\{\tilde{\mathbf{x}}(i)\}_{i \ge 0}$, is given by the following Algorithm.



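Algorithm 1: QCF
Initialize
    $\tilde{x}_n(0) = x_n(0)$, $\forall n$;
    $i = 0$;
begin
    while $\sup_{1 \le n \le N} \sup_{l \in \Omega_n(i)} \left| \tilde{x}_l(i) + \nu_{nl}(i) \right| < (p + 1/2)\Delta$ do
        $\tilde{x}_n(i+1) = (1 - \alpha(i) d_n(i))\, \tilde{x}_n(i) + \alpha(i) \sum_{l \in \Omega_n(i)} q\left(\tilde{x}_l(i) + \nu_{nl}(i)\right)$, $\forall n$;
        $i = i + 1$;
    end
    Stop the algorithm and reset all the sensor states to zero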



The last step of the algorithm can be distributed, since the network is connected. 
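The QCF iteration can be sketched in a few lines of simulation code. The sketch below is illustrative only: it assumes NumPy, a centralized simulation of all nodes, a rounding quantizer, and i.i.d. Bernoulli link failures with probability `1 - link_prob` (a simplification of the paper's link failure model). The names `quantize`, `qcf`, and `link_prob` and the ring topology are assumptions introduced here.

```python
import numpy as np

def quantize(y, delta):
    """Uniform quantizer with step delta: round to the nearest multiple of delta."""
    return delta * np.floor(y / delta + 0.5)

def qcf(x0, adjacency, delta, p, weights, link_prob=1.0, seed=0):
    """Run the QCF iteration; return (final_state, saturated_flag)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    limit = (p + 0.5) * delta                 # edge of the non-saturating range
    for a in weights:
        # Sample which undirected links are alive at this step (assumed i.i.d. Bernoulli).
        coin = np.triu(rng.random((n, n)) < link_prob, 1)
        active = np.logical_and(adjacency, coin | coin.T)
        # Independent dither nu_{nl}(i) for every ordered pair (n, l).
        dither = rng.uniform(-delta / 2, delta / 2, size=(n, n))
        inputs = x[None, :] + dither          # entry (n, l) is x_l + nu_{nl}
        if np.any(np.abs(inputs[active]) >= limit):
            return np.zeros(n), True          # a quantizer overloads: stop and reset
        received = np.where(active, quantize(inputs, delta), 0.0)
        degree = active.sum(axis=1)           # d_n(i), number of active neighbors
        x = (1.0 - a * degree) * x + a * received.sum(axis=1)
    return x, False

# Hypothetical 8-node ring with initial states in [-1, 1].
ring = np.zeros((8, 8), dtype=bool)
for k in range(8):
    ring[k, (k + 1) % 8] = ring[(k + 1) % 8, k] = True
x0 = np.linspace(-1.0, 1.0, 8)
weights = [0.2 / (i + 1) for i in range(2000)]

x_ok, sat_ok = qcf(x0, ring, delta=0.1, p=100, weights=weights, link_prob=0.8)
x_bad, sat_bad = qcf(x0, ring, delta=0.1, p=2, weights=weights, link_prob=1.0)
```

With `p=100` the dynamic range is far larger than the initial bound, so the run behaves like QC. With `p=2` the range $(p + 1/2)\Delta = 0.25$ is smaller than the initial states, so the first quantizer input already overloads and the states are reset to zero.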

B. Probability Bounds on Uniform Boundedness of Sample Paths of QC 

The analysis of the QCF algorithm requires uniformity properties of the sample paths generated by the QC
algorithm. This is necessary, because the QCF algorithm follows the QC algorithm until one of the quantizers gets
overloaded. The uniformity properties require establishing statistical properties of the supremum taken over the
sample paths, which is carried out in this subsection. We show that the state vector sequence, $\{\mathbf{x}(i)\}_{i \ge 0}$, generated
by the QC algorithm is uniformly bounded with high probability. The proof follows by splitting the sequence
$\{\mathbf{x}(i)\}_{i \ge 0}$ as the sum of the sequences $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ and $\{\mathbf{x}_{\mathcal{C}^\perp}(i)\}_{i \ge 0}$, for which we establish uniformity results.
The proof is lengthy and uses mainly maximal inequalities for submartingale and supermartingale sequences.
Recall that the state vector at any time $i$ can be decomposed orthogonally as

$$\mathbf{x}(i) = x_{\mathrm{avg}}(i)\mathbf{1} + \mathbf{x}_{\mathcal{C}^\perp}(i) \tag{74}$$

where the consensus subspace, $\mathcal{C}$, is given in eqn. (21). We provide probability bounds on the sequences $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$
and $\{\mathbf{x}_{\mathcal{C}^\perp}(i)\}_{i \ge 0}$ and then use a union bound to get the final result.

The rest of the subsection concerns the proof of Theorem 12, which involves several intermediate lemmas as
stated below, whose proofs are provided in Appendix II.

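We need the following result.

Lemma 9 Consider the QC algorithm stated in Section II and let $\{\mathbf{x}(i)\}_{i \ge 0}$ be the state sequence it generates.
Define the function $W(i,\mathbf{x})$, $i \ge 0$, $\mathbf{x} \in \mathbb{R}^{N \times 1}$, as

$$W(i,\mathbf{x}) = \left(1 + V(i,\mathbf{x})\right) \prod_{j \ge i} \left[1 + g(j)\right] \tag{75}$$

where $V(i,\mathbf{x}) = \mathbf{x}^T \bar{L} \mathbf{x}$ and $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).² Then, the process $\{W(i,\mathbf{x}(i))\}_{i \ge 0}$ is a non-negative
supermartingale with respect to the filtration $\{\mathcal{F}_i\}_{i \ge 0}$ defined in eqn. (45).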



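The next Lemma bounds the sequence $\{\mathbf{x}_{\mathcal{C}^\perp}(i)\}_{i \ge 0}$.

Lemma 10 Let $\{\mathbf{x}(i)\}_{i \ge 0}$ be the state vector sequence generated by the QC algorithm, with the initial state $\mathbf{x}(0) \in \mathbb{R}^{N \times 1}$. Consider the orthogonal decomposition:

$$\mathbf{x}(i) = x_{\mathrm{avg}}(i)\mathbf{1} + \mathbf{x}_{\mathcal{C}^\perp}(i), \quad \forall i \tag{76}$$

Then, for any $a > 0$, we have

$$\mathbb{P}\left[ \sup_{j \ge 0} \|\mathbf{x}_{\mathcal{C}^\perp}(j)\|^2 > a \right] \le \frac{\left(1 + \mathbf{x}(0)^T \bar{L}\, \mathbf{x}(0)\right) \prod_{j \ge 0} \left(1 + g(j)\right)}{1 + a\lambda_2(\bar{L})} \tag{77}$$

where $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).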

Next, we provide probability bounds on the uniform boundedness of $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$.

Lemma 11 Let $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ be the average sequence generated by the QC algorithm, with an initial state $\mathbf{x}(0) \in \mathbb{R}^{N \times 1}$. Then, for any $a > 0$,

$$\mathbb{P}\left[ \sup_{j \ge 0} |x_{\mathrm{avg}}(j)| > a \right] \le \frac{\left[ x_{\mathrm{avg}}^2(0) + \frac{2|\mathcal{M}|\Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \right]^{1/2}}{a} \tag{78}$$

2 The above function is well-defined because the term $\prod_{j \ge i} [1 + g(j)]$ is finite for any $i$, by the persistence condition on the weight sequence.
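Theorem 12 Let $\{\mathbf{x}(i)\}_{i \ge 0}$ be the state vector sequence generated by the QC algorithm, with an initial state $\mathbf{x}(0) \in \mathbb{R}^{N \times 1}$. Then, for any $a > 0$,

$$\mathbb{P}\left[ \sup_{j \ge 0} \|\mathbf{x}(j)\| > a \right] \le \frac{\left[ 2N x_{\mathrm{avg}}^2(0) + \frac{4|\mathcal{M}|\Delta^2}{3N} \sum_{j \ge 0} \alpha^2(j) \right]^{1/2}}{a} + \frac{\left(1 + \mathbf{x}(0)^T \bar{L}\, \mathbf{x}(0)\right) \prod_{j \ge 0} \left(1 + g(j)\right)}{1 + \frac{a^2}{2}\lambda_2(\bar{L})} \tag{79}$$

where $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).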
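The way the two lemmas combine through a union bound can be sketched as follows. This is only an outline, assuming that Lemma 10 bounds $\mathbb{P}[\sup_j \|\mathbf{x}_{\mathcal{C}^\perp}(j)\|^2 > a]$ by a ratio with denominator $1 + a\lambda_2(\bar{L})$ and that Lemma 11 bounds $\mathbb{P}[\sup_j |x_{\mathrm{avg}}(j)| > a]$ by a square-root term divided by $a$; the paper's own proof is given in Appendix II and may proceed differently.

```latex
% Orthogonal decomposition of the state: for every j,
\|\mathbf{x}(j)\|^2 = N\,x_{\mathrm{avg}}^2(j) + \|\mathbf{x}_{\mathcal{C}^\perp}(j)\|^2 .

% If \|x(j)\| > a for some j, at least one of the two summands exceeds a^2/2, hence
\Big\{ \sup_{j \ge 0} \|\mathbf{x}(j)\| > a \Big\}
  \subseteq
\Big\{ \sup_{j \ge 0} |x_{\mathrm{avg}}(j)| > \tfrac{a}{\sqrt{2N}} \Big\}
  \cup
\Big\{ \sup_{j \ge 0} \|\mathbf{x}_{\mathcal{C}^\perp}(j)\|^2 > \tfrac{a^2}{2} \Big\} .

% Union bound, using the average-sequence bound at level a/\sqrt{2N}
% and the orthogonal-component bound at level a^2/2:
\mathbb{P}\Big[ \sup_{j \ge 0} \|\mathbf{x}(j)\| > a \Big]
  \le \frac{\sqrt{2N}}{a} \Big[ x_{\mathrm{avg}}^2(0) + \frac{2|\mathcal{M}|\Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \Big]^{1/2}
    + \frac{\big(1 + \mathbf{x}(0)^T \bar{L}\,\mathbf{x}(0)\big) \prod_{j \ge 0} (1 + g(j))}{1 + \frac{a^2}{2}\lambda_2(\bar{L})} .

% Moving \sqrt{2N} inside the square root gives
% [ 2N x_avg^2(0) + (4|M| \Delta^2 / (3N)) \sum_j \alpha^2(j) ]^{1/2} / a  as the first term.
```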



We now state as a Corollary the result on the boundedness of the sensor states, which will be used in analyzing
the performance of the QCF algorithm.

Corollary 13 Assume that the initial sensor state $\mathbf{x}(0) \in \mathcal{B}$, where $\mathcal{B}$ is given in eqn. (72). Then, if $\{\mathbf{x}(i)\}_{i \ge 0}$ is
the state sequence generated by the QC algorithm starting from the initial state $\mathbf{x}(0)$, we have, for any $a > 0$,

$$\mathbb{P}\left[ \sup_{1 \le n \le N,\ j \ge 0} |x_n(j)| > a \right] \le \frac{\left[ 2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N} \sum_{j \ge 0} \alpha^2(j) \right]^{1/2}}{a} + \frac{\left(1 + N\lambda_N(\bar{L})b^2\right) \prod_{j \ge 0} \left(1 + g(j)\right)}{1 + \frac{a^2}{2}\lambda_2(\bar{L})} \tag{80}$$

where $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).



C. Algorithm QCF: Asymptotic Consensus 

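We show that the QCF algorithm, given in Subsection IV-A, converges a.s. to a finite random variable and the
sensors reach consensus asymptotically.

Theorem 14 (QCF: a.s. asymptotic consensus) Let $\{\tilde{\mathbf{x}}(i)\}_{i \ge 0}$ be the state vector sequence generated by the QCF
algorithm, starting from an initial state $\tilde{\mathbf{x}}(0) = \mathbf{x}(0) \in \mathcal{B}$. Then, the sensors reach consensus asymptotically a.s. In
other words, there exists an a.s. finite random variable $\tilde{\theta}$ such that

$$\mathbb{P}\left[ \lim_{i \to \infty} \tilde{\mathbf{x}}(i) = \tilde{\theta}\mathbf{1} \right] = 1 \tag{81}$$

Proof: For the proof, consider the sequence $\{\mathbf{x}(i)\}_{i \ge 0}$ generated by the QC algorithm, with the same initial
state $\mathbf{x}(0)$. Let $\theta$ be the a.s. finite random variable (see eqn. (43)) such that

$$\mathbb{P}\left[ \lim_{i \to \infty} \mathbf{x}(i) = \theta\mathbf{1} \right] = 1 \tag{82}$$

It is clear that

$$\tilde{\theta} = \begin{cases} \theta & \text{on } \left\{ \sup_{i \ge 0} \sup_{1 \le n \le N} \sup_{l \in \Omega_n(i)} |x_l(i) + \nu_{nl}(i)| < \left(p + \frac{1}{2}\right)\Delta \right\} \\ 0 & \text{otherwise} \end{cases} \tag{83}$$

In other words, we have

$$\tilde{\theta} = \theta\, \mathbb{I}\left( \sup_{i \ge 0} \sup_{1 \le n \le N} \sup_{l \in \Omega_n(i)} |x_l(i) + \nu_{nl}(i)| < \left(p + \frac{1}{2}\right)\Delta \right) \tag{84}$$

where $\mathbb{I}(\cdot)$ is the indicator function. Since $\left\{ \sup_{i \ge 0} \sup_{1 \le n \le N} \sup_{l \in \Omega_n(i)} |x_l(i) + \nu_{nl}(i)| < (p + 1/2)\Delta \right\}$ is a measurable set, it follows that $\tilde{\theta}$ is a random variable. ■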
D. QCF: ε-Consensus

Recall the QCF algorithm in Subsection IV-A and the assumptions 1)-4). A key step is that, if we run the QC
algorithm using finite bit quantizers with finite alphabet $\mathcal{Q}$ as in eqn. (73), the only way for an error to occur is
for one of the quantizers to saturate. This is the intuition behind the design of the QCF algorithm.

Theorem 14 shows that the QCF sensor states asymptotically reach consensus, converging a.s. to a finite random
variable $\tilde{\theta}$. The next series of results addresses the question of how close this consensus is to the desired average $r$
in (1). Clearly, this depends on the QCF design: 1) the quantizer parameters (like the number of levels $2p + 1$ or
the quantization step $\Delta$); 2) the random network topology; and 3) the gains $\alpha$.

We define the following performance metrics which characterize the performance of the QCF algorithm. 

Definition 15 (Probability of ε-consensus and consensus-consistent) The probability of ε-consensus is defined as

$$T(G, b, \alpha, \epsilon, p, \Delta) = \mathbb{P}\left[ \lim_{i \to \infty} \sup_{1 \le n \le N} |\tilde{x}_n(i) - r| < \epsilon \right] \tag{85}$$

Note that the argument $G$ in the definition of $T(\cdot)$ emphasizes the influence of the network configuration, whereas
$b$ is given in eqn. (72).

The QCF algorithm is consensus-consistent³ iff for every $G$, $b$, $\epsilon > 0$ and $0 < \delta < 1$, there exist quantizer
parameters $p$, $\Delta$ and weights $\{\alpha(i)\}_{i \ge 0}$, such that

$$T(G, b, \alpha, \epsilon, p, \Delta) > 1 - \delta \tag{86}$$

Theorem 17 characterizes the probability of ε-consensus, while Proposition 18 considers several tradeoffs between
the probability of achieving consensus and the quantizer parameters and network topology, and, in particular, shows
that the QCF algorithm is consensus-consistent. We need the following Lemma to prove Theorem 17.

Lemma 16 Let $\tilde{\theta}$ be defined as in Theorem 14, with the initial state $\tilde{\mathbf{x}}(0) = \mathbf{x}(0) \in \mathcal{B}$. The desired average, $r$, is
given in (1). Then, for any $\epsilon > 0$, we have

$$\mathbb{P}\left[ |\tilde{\theta} - r| \ge \epsilon \right] \le \frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2} \sum_{j \ge 0} \alpha^2(j) + \frac{\left[ 2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N} \sum_{j \ge 0} \alpha^2(j) \right]^{1/2}}{p\Delta} \tag{87}$$

$$\qquad\qquad + \frac{\left(1 + N\lambda_N(\bar{L})b^2\right) \prod_{j \ge 0} \left(1 + g(j)\right)}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\bar{L})} \tag{88}$$

where $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).

Proof: The proof is provided in Appendix III. ■
We now state the main result of this Section, which provides a performance guarantee for QCF.

3 Consensus-consistent means that, for arbitrary $\epsilon > 0$, the QCF quantizers can be designed so that the QCF states get within an ε-ball of $r$ with
arbitrarily high probability. Thus, a consensus-consistent algorithm trades off accuracy with bit-rate.
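Theorem 17 (QCF: Probability of ε-consensus) For any $\epsilon > 0$, the probability of ε-consensus $T(G, b, \alpha, \epsilon, p, \Delta)$ is
bounded below

$$\mathbb{P}\left[ \lim_{i \to \infty} \sup_{1 \le n \le N} |\tilde{x}_n(i) - r| < \epsilon \right] \ge 1 - \frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2} \sum_{j \ge 0} \alpha^2(j) - \frac{\left[ 2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N} \sum_{j \ge 0} \alpha^2(j) \right]^{1/2}}{p\Delta} - \frac{\left(1 + N\lambda_N(\bar{L})b^2\right) \prod_{j \ge 0} \left(1 + g(j)\right)}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\bar{L})} \tag{89}$$

where $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).

Proof: It follows from Theorem 14 that

$$\lim_{i \to \infty} \tilde{x}_n(i) = \tilde{\theta} \ \text{a.s.}, \quad \forall\, 1 \le n \le N \tag{90}$$

The proof then follows from Lemma 16. ■
The lower bound on $T(\cdot)$, given by (89), is uniform, in the sense that it is applicable for all initial states $\mathbf{x}(0) \in \mathcal{B}$.
Recall the scaled weight sequence $\alpha_s$, given by eqn. (60). We introduce the zero-rate probability of ε-consensus,
$T^z(G, b, \epsilon, p, \Delta)$, by

$$T^z(G, b, \epsilon, p, \Delta) = \lim_{s \to 0} T(G, b, \alpha_s, \epsilon, p, \Delta) \tag{91}$$

The next proposition studies the dependence of the ε-consensus probability $T(\cdot)$ and of the zero-rate probability
$T^z(\cdot)$ on the network and algorithm parameters.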

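Proposition 18 (QCF: Tradeoffs) 1) Limiting quantizer. For fixed $G$, $b$, $\alpha$, $\epsilon$, we have

$$\lim_{\Delta \to 0,\ p\Delta \to \infty} T(G, b, \alpha, \epsilon, p, \Delta) = 1 \tag{92}$$

Since this holds for arbitrary $\epsilon > 0$, we note that, as $\Delta \to 0$, $p\Delta \to \infty$,

$$\mathbb{P}\left[ \lim_{i \to \infty} \tilde{\mathbf{x}}(i) = r\mathbf{1} \right] = \lim_{\epsilon \to 0} \mathbb{P}\left[ \lim_{i \to \infty} \sup_{1 \le n \le N} |\tilde{x}_n(i) - r| < \epsilon \right] = \lim_{\epsilon \to 0} \left[ \lim_{\Delta \to 0,\ p\Delta \to \infty} T(G, b, \alpha, \epsilon, p, \Delta) \right] = 1$$

In other words, the QCF algorithm leads to a.s. consensus to the desired average $r$, as $\Delta \to 0$, $p\Delta \to \infty$. In
particular, it shows that the QCF algorithm is consensus-consistent.

2) Zero-rate ε-consensus probability. For fixed $G$, $b$, $\epsilon$, $p$, $\Delta$, we have

$$T^z(G, b, \epsilon, p, \Delta) \ge 1 - \frac{\left(2Nb^2\right)^{1/2}}{p\Delta} - \frac{1 + N\lambda_N(\bar{L})b^2}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\bar{L})} \tag{93}$$

3) Optimum quantization step-size $\Delta$. For fixed $G$, $b$, $\epsilon$, $p$, the optimum quantization step-size $\Delta$, which maximizes
the probability of ε-consensus, $T(G, b, \alpha, \epsilon, p, \Delta)$, is given by

$$\Delta^*(G, b, \alpha, \epsilon, p) = \arg\inf_{\Delta \ge 0} \left[ \frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2} \sum_{j \ge 0} \alpha^2(j) + \frac{\left[ 2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N} \sum_{j \ge 0} \alpha^2(j) \right]^{1/2}}{p\Delta} + \frac{\left(1 + N\lambda_N(\bar{L})b^2\right) \prod_{j \ge 0} \left(1 + g(j)\right)}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\bar{L})} \right] \tag{94}$$

where $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).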



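Proof: For item 2), we note that, as $s \to 0$,

$$\sum_{j \ge 0} \alpha_s^2(j) \to 0, \qquad \prod_{j \ge 0} \left(1 + g_s(j)\right) \to 1$$

where $g_s(j)$ denotes the sequence in eqn. (42) evaluated with the scaled weights $\alpha_s$. The rest follows by simple inspection of eqn. (89). ■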
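The lower bound of Theorem 17, its zero-rate limit, and the step-size search of item 3) can be explored numerically. The sketch below assumes NumPy, and assumes that the bound has the three-term form $1 - t_1 - t_2 - t_3$ with $g(j)$ as in eqn. (42). Infinite sums and products are truncated to the length of the weight array, and the arg-inf is replaced by a grid search. All function names and the toy parameter values are hypothetical.

```python
import numpy as np

def g_sequence(alpha, n_nodes, num_edges, delta, lam2, lamN):
    """g(j) in the assumed form of eqn. (42), evaluated on a finite weight array."""
    c = max(lamN**3 / lam2 + 4.0 * n_nodes**2 * lamN / lam2,
            2.0 * num_edges * delta**2 * lamN / 3.0)
    return c * alpha**2

def consensus_prob_lower_bound(alpha, n_nodes, num_edges, lam2, lamN, b, eps, p, delta):
    """Assumed right-hand side of the Theorem 17 bound, with truncated sum and product."""
    s2 = np.sum(alpha**2)
    g = g_sequence(alpha, n_nodes, num_edges, delta, lam2, lamN)
    prod = np.exp(np.sum(np.log1p(g)))            # product of (1 + g(j)), computed stably
    t1 = 2.0 * num_edges * delta**2 * s2 / (3.0 * n_nodes**2 * eps**2)
    t2 = np.sqrt(2.0 * n_nodes * b**2
                 + 4.0 * num_edges * delta**2 * s2 / (3.0 * n_nodes)) / (p * delta)
    t3 = (1.0 + n_nodes * lamN * b**2) * prod / (1.0 + 0.5 * (p * delta)**2 * lam2)
    return 1.0 - t1 - t2 - t3                     # may be negative, i.e. vacuous

def zero_rate_lower_bound(n_nodes, lam2, lamN, b, p, delta):
    """Limit of the bound above as the weight scaling s goes to zero."""
    return (1.0 - np.sqrt(2.0 * n_nodes) * b / (p * delta)
            - (1.0 + n_nodes * lamN * b**2) / (1.0 + 0.5 * (p * delta)**2 * lam2))

def best_step_size(alpha, n_nodes, num_edges, lam2, lamN, b, eps, p, grid):
    """Grid search standing in for the arg-inf over the step size."""
    vals = [consensus_prob_lower_bound(alpha, n_nodes, num_edges, lam2, lamN, b, eps, p, d)
            for d in grid]
    k = int(np.argmax(vals))
    return grid[k], vals[k]

# Toy parameters (hypothetical, not the paper's numerical study).
j = np.arange(5000)
alpha = 0.01 / (j + 1.0)
params = dict(n_nodes=10, num_edges=15, lam2=1.0, lamN=5.0, b=1.0, eps=0.5, p=2**20)
grid = np.logspace(-4, 2, 61)
delta_star, bound_star = best_step_size(alpha, grid=grid, **params)
```

The search reflects the tension visible in the bound: a larger step size widens the dynamic range $p\Delta$ and shrinks the last two terms, but inflates the first term, which grows with $\Delta^2$.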



We comment on Proposition 18. Item 1) shows that the algorithm QCF is consensus-consistent, in the sense that
we can achieve arbitrarily good performance by decreasing the step-size $\Delta$ and increasing the number of quantization levels,
$2p + 1$, appropriately. Indeed, decreasing the step-size increases the precision of the quantized output and increasing
$p$ increases the dynamic range of the quantizer. However, the fact that $\Delta \to 0$ but $p\Delta \to \infty$ implies that the rate
of growth of the number of levels $2p + 1$ should be higher than the rate of decay of $\Delta$, guaranteeing that in the
limit we have asymptotic consensus with probability one.

For interpreting item 2), we recall the m.s.e. versus convergence rate tradeoff for the QC algorithm, studied in
Subsection III-B. There, we considered a quantizer with a countably infinite number of output levels (as opposed
to the finite number of output levels in the QCF) and observed that the m.s.e. can be made arbitrarily small by
rescaling the weight sequence. By Chebyshev's inequality, this would imply that, for arbitrary $\epsilon > 0$, the probability
of ε-consensus, i.e., that we get within an ε-ball of the desired average, can be made as close to 1 as we want.
However, this occurs at a cost of the convergence rate, which decreases as the scaling factor $s$ decreases. Thus,
for the QC algorithm, in the limiting case, as $s \to 0$, the probability of ε-consensus (for arbitrary $\epsilon > 0$) goes
to 1; we call this limiting probability the zero-rate probability of ε-consensus, justifying the m.s.e. vs. convergence
rate tradeoff.⁴ Item 2) shows that, similar to the QC algorithm, the QCF algorithm exhibits a tradeoff between
probability of ε-consensus and the convergence rate, in the sense that, by scaling (decreasing $s$), the probability of
ε-consensus can be increased. However, contrary to the QC case, scaling will not lead to a probability of ε-consensus
arbitrarily close to 1, and, in fact, the zero-rate probability of ε-consensus is strictly less than one, as given by
eqn. (93). In other words, by scaling, we can make $T(G, b, \alpha_s, \epsilon, p, \Delta)$ as high as $T^z(G, b, \epsilon, p, \Delta)$, but no higher.

We now interpret the lower bound on the zero-rate probability of ε-consensus, $T^z(G, b, \epsilon, p, \Delta)$, and show that
the network topology plays an important role in this context. We note that, for a fixed number, $N$, of sensor nodes,
the only way the topology enters into the expression of the lower bound is through the third term on the R.H.S.

4 Note that, for both the algorithms, QC and QCF, we can take the scaling factor, $s$, arbitrarily close to 0, but not zero, so that these limiting
performance values are not achievable, but we may get arbitrarily close to them.
Then, assuming that

$$N\lambda_N(\bar{L})b^2 \gg 1, \qquad \frac{p^2\Delta^2}{2}\lambda_2(\bar{L}) \gg 1$$

we may use the approximation

$$\frac{1 + N\lambda_N(\bar{L})b^2}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\bar{L})} \approx \left( \frac{2Nb^2}{p^2\Delta^2} \right) \frac{\lambda_N(\bar{L})}{\lambda_2(\bar{L})} \tag{95}$$

Let us interpret eqn. (95) in the case where the topology is fixed (non-random). Then, for all $i$, $L(i) = \bar{L} = L$.
Thus, for a fixed number, $N$, of sensor nodes, topologies with smaller $\lambda_N(L)/\lambda_2(L)$ will lead to higher zero-rate
probability of ε-consensus and, hence, are preferable. We note that, in this context, for fixed $N$, the class of non-bipartite
Ramanujan graphs gives the smallest $\lambda_N(L)/\lambda_2(L)$ ratio, given a constraint on the number, $M$, of network
edges (see [9]).

Item 3) shows that, for given graph topology $G$, initial sensor data bound $b$, link weight sequence $\alpha$, tolerance $\epsilon$,
and number of levels in the quantizer $p$, the step-size $\Delta$ plays a significant role in determining the performance.
This gives insight into the design of quantizers to achieve optimal performance, given a constraint on the number
of quantization levels, or, equivalently, given a bit budget on the communication.

In the next Subsection, we present some numerical studies on the QCF algorithm, which demonstrate practical
implications of the results just discussed.
E. QCF: Numerical Studies 

We present a set of numerical studies on the quantizer step-size optimization problem considered in Item 3) of
Proposition 18. We consider a fixed (non-random) sensor network of $N = 230$ nodes, with communication topology
given by an LPS-II Ramanujan graph (see [9]) of degree 6.$^{5}$ We fix $\epsilon$ at .05 and take the initial sensor data bound,
$b$, to be 30. We numerically solve the step-size optimization problem given in (94) for a varying number of levels,
$2p + 1$. Specifically, we consider two instances of the optimization problem. In the first instance, we consider the
weight sequence $\alpha(i) = .01/(i + 1)$ ($s = .01$) and numerically solve the optimization problem for a varying number
of levels. In the second instance, we repeat the same experiment with the weight sequence $\alpha(i) = .001/(i + 1)$
($s = .001$). As in eqn. (94), $\Delta^{*}(G, b, \alpha_s, \epsilon, p)$ denotes the optimal step-size. Also, let $T^{*}(G, b, \alpha_s, \epsilon, p)$ be the
corresponding optimum probability of $\epsilon$-consensus. Fig. 1 on the left plots $T^{*}(G, b, \alpha_s, \epsilon, p)$ for varying $2p + 1$ on
the vertical axis, while on the horizontal axis we plot the corresponding quantizer bit-rate $BR = \log_2(2p + 1)$.
The two plots correspond to two different scalings, namely, $s = .01$ and $s = .001$, respectively. The result is in
strict agreement with Item 2) of Proposition 18 and shows that, as the scaling factor decreases, the probability of
$\epsilon$-consensus increases, until it reaches the zero-rate probability of $\epsilon$-consensus.

Fig. 1 on the right plots $\Delta^{*}(G, b, \alpha_s, \epsilon, p)$ for varying $2p + 1$ on the vertical axis, while on the horizontal axis we
plot the corresponding quantizer bit-rate $BR = \log_2(2p + 1)$. The two plots correspond to two different scalings,
namely, $s = .01$ and $s = .001$, respectively. The results are again in strict agreement with Proposition 18 and further
show that optimizing the step-size is an important quantizer design problem, because the optimal step-size value is
sensitive to the number of quantization levels, $2p + 1$.

5 This is a 6-regular graph, i.e., all the nodes have degree 6. 
V. Conclusion

The paper considers distributed average consensus with quantized information exchange and random inter-sensor
link failures. We add dither to the sensor states before quantization. We show by stochastic approximation that,
when the range of the quantizer is unbounded (the QC algorithm), the sensor states achieve a.s. and m.s.s. consensus
to a random variable whose mean is the desired average. The variance of this random variable can be made small
by tuning parameters of the algorithm (rate of decay of the gains), the network topology, and the quantizer parameters.
When the range of the quantizer is bounded (the QCF algorithm), a sample path analysis shows that the state vector of
the QC algorithm can be made to remain uniformly bounded with probability arbitrarily close to 1. This means that
the QCF algorithm achieves $\epsilon$-consensus. We use the bounds that we derive for the probability of large excursions of
the sample paths to formulate a quantizer design problem that trades off several quantizer parameters: number
of bits (or levels), step size, probability of saturation, and error margin to consensus. A numerical study illustrates
this design problem and several interesting tradeoffs among the design parameters.

Appendix I 
Proofs of Lemmas 4 and 7 

Before proving Lemmas 4 and 7, we state a result from [49] on a property of real number sequences that is
used later; see [49] for the proof.

Lemma 19 (Lemma 18 in [49]) Let the sequences $\{r_1(t)\}_{t\ge 0}$ and $\{r_2(t)\}_{t\ge 0}$ be given by

$$ r_1(t) = \frac{a_1}{(t+1)^{\delta_1}}, \qquad r_2(t) = \frac{a_2}{(t+1)^{\delta_2}} \tag{96} $$

where $a_1, a_2, \delta_2 \ge 0$ and $0 \le \delta_1 \le 1$. Then, if $\delta_1 = \delta_2$, there exists $K > 0$ such that, for non-negative integers $s < t$,

$$ \sum_{k=s}^{t-1}\left[\prod_{l=k+1}^{t-1}\bigl(1 - r_1(l)\bigr)\right] r_2(k) \le K \tag{97} $$

Moreover, the constant $K$ can be chosen independently of $s, t$. Also, if $\delta_1 < \delta_2$, then, for arbitrary fixed $s$,

$$ \lim_{t\to\infty}\sum_{k=s}^{t-1}\left[\prod_{l=k+1}^{t-1}\bigl(1 - r_1(l)\bigr)\right] r_2(k) = 0 \tag{98} $$
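
The two regimes of Lemma 19 can be checked numerically. The sketch below is illustrative only: the function name `lemma19_sum` and the parameter values are assumptions, and it evaluates the weighted sum $\sum_{k=s}^{t-1}\bigl[\prod_{l=k+1}^{t-1}(1 - r_1(l))\bigr] r_2(k)$ through the equivalent recursion $S_{k+1} = (1 - r_1(k))S_k + r_2(k)$ with $S_s = 0$.

```python
def lemma19_sum(a1, delta1, a2, delta2, s, t):
    """Evaluate sum_{k=s}^{t-1} [prod_{l=k+1}^{t-1} (1 - r1(l))] * r2(k).

    Uses the recursion S_{k+1} = (1 - r1(k)) * S_k + r2(k), S_s = 0,
    with r1(k) = a1 / (k + 1)**delta1 and r2(k) = a2 / (k + 1)**delta2.
    """
    total = 0.0
    for k in range(s, t):
        r1 = a1 / (k + 1) ** delta1
        r2 = a2 / (k + 1) ** delta2
        total = (1.0 - r1) * total + r2
    return total

# Equal exponents: the sum stays bounded as t grows.
for t in (10, 1000, 100000):
    print("equal exponents, t =", t, lemma19_sum(0.5, 1.0, 0.5, 1.0, 0, t))

# delta1 < delta2: the sum decays towards zero as t grows.
for t in (10, 1000, 100000):
    print("delta1 < delta2, t =", t, lemma19_sum(0.5, 1.0, 1.0, 2.0, 0, t))
```

With $a_1 = a_2$ and $\delta_1 = \delta_2 = 1$ each step is a convex combination of the previous value and 1, so the sum never exceeds 1 in this particular instance.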
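Proof: [Proof of Lemma 4] Taking expectations (unconditional) on both sides of eqn. (41), we have

$$ E[V(i+1, x(i+1))] \le E[V(i, x(i))] - \alpha(i)\,E[\varphi(i, x(i))] + g(i)\bigl[1 + E[V(i, x(i))]\bigr] $$

We also have the following inequalities for all $i$:

$$ \lambda_2(\overline{L})\,\|x_{\mathcal{C}^\perp}\|^2 \le V(i, x(i)) = x_{\mathcal{C}^\perp}^T \overline{L}\, x_{\mathcal{C}^\perp} \le \lambda_N(\overline{L})\,\|x_{\mathcal{C}^\perp}\|^2 \tag{99} $$

$$ 2\lambda_2^2(\overline{L})\,\|x_{\mathcal{C}^\perp}\|^2 \le \varphi(i, x(i)) = 2\,x_{\mathcal{C}^\perp}^T \overline{L}^2 x_{\mathcal{C}^\perp} \le 2\lambda_N^2(\overline{L})\,\|x_{\mathcal{C}^\perp}\|^2 \tag{100} $$

From the inequality above and eqns. (99,100), we have

$$ E[V(i+1, x(i+1))] \le \left(1 - 2\alpha(i)\frac{\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} + g(i)\right)E[V(i, x(i))] + g(i) \tag{101} $$

Choose $0 < \epsilon < \frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})}$ and note that the form of $g(i)$ in eqn. (42) and the fact that $\alpha(i) \to 0$ as $i \to \infty$ suggest
that there exists $i_\epsilon \ge 0$ such that $\epsilon\alpha(i) \ge g(i)$, $i \ge i_\epsilon$. We then have

$$ E[V(i+1, x(i+1))] \le \left(1 - \left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\alpha(i)\right)E[V(i, x(i))] + g(i), \quad i \ge i_\epsilon \tag{102} $$

Continuing the recursion, we have, for $i > i_\epsilon$,

$$ \begin{aligned} E[V(i, x(i))] &\le \prod_{j=i_\epsilon}^{i-1}\left(1 - \left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\alpha(j)\right)E[V(i_\epsilon, x(i_\epsilon))] + \sum_{j=i_\epsilon}^{i-1}\left[\prod_{l=j+1}^{i-1}\left(1 - \left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\alpha(l)\right)\right]g(j) \\ &\le e^{-\left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\sum_{j=i_\epsilon}^{i-1}\alpha(j)}\,E[V(i_\epsilon, x(i_\epsilon))] + \sum_{j=i_\epsilon}^{i-1}\left[\prod_{l=j+1}^{i-1}\left(1 - \left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\alpha(l)\right)\right]g(j) \end{aligned} \tag{103} $$

where we use $1 - a \le e^{-a}$ for $a \ge 0$. Since the $\alpha(i)$'s sum to infinity, we have

$$ \lim_{i\to\infty} e^{-\left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\sum_{j=i_\epsilon}^{i-1}\alpha(j)} = 0 \tag{104} $$

The second term on the R.H.S. of (103) falls under Lemma 19, whose second part (eqn. (98)) implies

$$ \lim_{i\to\infty}\sum_{j=i_\epsilon}^{i-1}\left[\prod_{l=j+1}^{i-1}\left(1 - \left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\alpha(l)\right)\right]g(j) = 0 \tag{105} $$

We conclude from eqn. (103) that $\lim_{i\to\infty} E[V(i, x(i))] = 0$. This with (99) implies $\lim_{i\to\infty} E\bigl[\|x_{\mathcal{C}^\perp}(i)\|^2\bigr] = 0$.
From the orthogonality arguments we have, for all $i$,

$$ E\bigl[\|x(i) - \theta\mathbf{1}\|^2\bigr] = E\bigl[\|x_{\mathcal{C}}(i) - \theta\mathbf{1}\|^2\bigr] + E\bigl[\|x_{\mathcal{C}^\perp}(i)\|^2\bigr] \tag{106} $$

The second term in eqn. (106) goes to zero by the above, whereas the first term goes to zero by the $\mathcal{L}^2$ convergence
of the sequence $\{x_{\rm avg}(i)\}_{i\ge 0}$ to $\theta$, and the desired m.s.s. convergence follows. ■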
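Proof: [Proof of Lemma 7] From (99,103), using repeatedly $1 - a \le e^{-a}$ for $a \ge 0$, we have, for $i > i_\epsilon$,

$$ \begin{aligned} E\bigl[\|x_{\mathcal{C}^\perp}(i)\|^2\bigr] &\le \frac{1}{\lambda_2(\overline{L})}E[V(i, x(i))] \\ &\le \frac{1}{\lambda_2(\overline{L})}e^{-\left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\sum_{j=i_\epsilon}^{i-1}\alpha(j)}E[V(i_\epsilon, x(i_\epsilon))] + \frac{1}{\lambda_2(\overline{L})}\sum_{j=i_\epsilon}^{i-1}\left[e^{-\left(\frac{2\lambda_2^2(\overline{L})}{\lambda_N(\overline{L})} - \epsilon\right)\sum_{l=j+1}^{i-1}\alpha(l)}\right]g(j) \end{aligned} $$

From the development in the proof of Theorem 3 we note that

$$ E\bigl[\|x_{\mathcal{C}}(i) - r\mathbf{1}\|^2\bigr] = N\,E\bigl[|x_{\rm avg}(i) - r|^2\bigr] $$

We then arrive at the result by using the equality

$$ \|x(i) - r\mathbf{1}\|^2 = \|x_{\mathcal{C}^\perp}(i)\|^2 + \|x_{\mathcal{C}}(i) - r\mathbf{1}\|^2, \quad \forall i $$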

Appendix II 
Proofs of results in Subsection IV-B 

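Proof: [Proof of Lemma 9] From eqn. (41) we have

$$ E[V(i+1, x(i+1)) \mid x(i)] \le -\alpha(i)\varphi(i, x(i)) + g(i)\bigl[1 + V(i, x(i))\bigr] + V(i, x(i)) \tag{107} $$

We then have

$$ \begin{aligned} E[W(i+1, x(i+1)) \mid \mathcal{F}_i] &= E\Bigl[\bigl(1 + V(i+1, x(i+1))\bigr)\prod_{j\ge i+1}[1 + g(j)] \Bigm| x(i)\Bigr] \\ &= \prod_{j\ge i+1}[1 + g(j)]\,\bigl(1 + E[V(i+1, x(i+1)) \mid x(i)]\bigr) \\ &\le \prod_{j\ge i+1}[1 + g(j)]\,\bigl(1 - \alpha(i)\varphi(i, x(i)) + g(i)[1 + V(i, x(i))] + V(i, x(i))\bigr) \\ &= -\alpha(i)\varphi(i, x(i))\prod_{j\ge i+1}[1 + g(j)] + [1 + V(i, x(i))]\prod_{j\ge i}[1 + g(j)] \\ &= -\alpha(i)\varphi(i, x(i))\prod_{j\ge i+1}[1 + g(j)] + W(i, x(i)) \end{aligned} $$

Hence $E[W(i+1, x(i+1)) \mid \mathcal{F}_i] \le W(i, x(i))$ and the result follows.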
Proof: [Proof of Lemma 10] For any $a > 0$ and $i \ge 0$, we have

$$ \|x_{\mathcal{C}^\perp}(i)\|^2 > a \implies x(i)^T\overline{L}\,x(i) \ge a\lambda_2(\overline{L}) \tag{111} $$

Define the potential function $V(i, x)$ as in Theorem 2 and eqn. (37), and $W(i, x)$ as in (75) in Lemma 9. It
then follows from eqn. (111) that

$$ \|x_{\mathcal{C}^\perp}(i)\|^2 > a \implies W(i, x(i)) > 1 + a\lambda_2(\overline{L}) \tag{112} $$

By Lemma 9, the process $(W(i, x(i)), \mathcal{F}_i)$ is a non-negative supermartingale. Then, by a maximal inequality for
non-negative supermartingales (see [50]), we have, for $a > 0$ and $i \ge 0$,

$$ P\Bigl[\max_{0\le j\le i} W(j, x(j)) \ge a\Bigr] \le \frac{E[W(0, x(0))]}{a} \tag{113} $$

Also, we note that

$$ \Bigl\{\sup_{j\ge 0} W(j, x(j)) > a\Bigr\} = \bigcup_{i\ge 0}\Bigl\{\max_{0\le j\le i} W(j, x(j)) > a\Bigr\} \tag{114} $$

Since $\{\max_{0\le j\le i} W(j, x(j)) > a\}$ is a non-decreasing sequence of sets in $i$, it follows from the continuity of
probability measures and eqn. (112) that

$$ \begin{aligned} P\Bigl[\sup_{j\ge 0}\|x_{\mathcal{C}^\perp}(j)\|^2 > a\Bigr] &= \lim_{i\to\infty} P\Bigl[\max_{0\le j\le i}\|x_{\mathcal{C}^\perp}(j)\|^2 > a\Bigr] \\ &\le \lim_{i\to\infty} P\Bigl[\max_{0\le j\le i} W(j, x(j)) > 1 + a\lambda_2(\overline{L})\Bigr] \\ &\le \lim_{i\to\infty}\frac{E[W(0, x(0))]}{1 + a\lambda_2(\overline{L})} = \frac{\bigl(1 + x(0)^T\overline{L}\,x(0)\bigr)\prod_{j\ge 0}(1 + g(j))}{1 + a\lambda_2(\overline{L})} \end{aligned} \tag{115} $$




Proof: [Proof of Lemma 11] It was shown in Theorem 3 that the sequence $\{x_{\rm avg}(i)\}_{i\ge 0}$ is a martingale. It
then follows that the sequence $\{|x_{\rm avg}(i)|\}_{i\ge 0}$ is a non-negative submartingale (see [44]).
The submartingale inequality then states that, for $a > 0$,

$$ P\Bigl[\max_{0\le j\le i}|x_{\rm avg}(j)| \ge a\Bigr] \le \frac{E[|x_{\rm avg}(i)|]}{a} $$

Clearly, from the continuity of probability measures,

$$ P\Bigl[\sup_{j\ge 0}|x_{\rm avg}(j)| > a\Bigr] = \lim_{i\to\infty} P\Bigl[\max_{0\le j\le i}|x_{\rm avg}(j)| > a\Bigr] $$

Thus, we have

$$ P\Bigl[\sup_{j\ge 0}|x_{\rm avg}(j)| > a\Bigr] \le \lim_{i\to\infty}\frac{E[|x_{\rm avg}(i)|]}{a} \tag{118} $$

(the limit on the right exists because $x_{\rm avg}(i)$ converges in $\mathcal{L}^1$.) Also, we have from eqn. (53), for all $i$,

$$ E[|x_{\rm avg}(i)|] \le \Bigl(E\bigl[|x_{\rm avg}(i)|^2\bigr]\Bigr)^{1/2} \le \left[x_{\rm avg}^2(0) + \frac{2|\mathcal{M}|\Delta^2}{3N^2}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2} \tag{119} $$

Combining eqns. (118,119), we have

$$ P\Bigl[\sup_{j\ge 0}|x_{\rm avg}(j)| > a\Bigr] \le \frac{\left[x_{\rm avg}^2(0) + \frac{2|\mathcal{M}|\Delta^2}{3N^2}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2}}{a} $$
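Proof: [Proof of Theorem 12] Since $\|x(j)\|^2 = N x_{\rm avg}^2(j) + \|x_{\mathcal{C}^\perp}(j)\|^2$, we have

$$ \begin{aligned} P\Bigl[\sup_{j\ge 0}\|x(j)\|^2 > a\Bigr] &\le P\Bigl[\sup_{j\ge 0} N|x_{\rm avg}(j)|^2 > \frac{a}{2}\Bigr] + P\Bigl[\sup_{j\ge 0}\|x_{\mathcal{C}^\perp}(j)\|^2 > \frac{a}{2}\Bigr] \\ &= P\Bigl[\sup_{j\ge 0}|x_{\rm avg}(j)| > \Bigl(\frac{a}{2N}\Bigr)^{1/2}\Bigr] + P\Bigl[\sup_{j\ge 0}\|x_{\mathcal{C}^\perp}(j)\|^2 > \frac{a}{2}\Bigr] \end{aligned} $$

We thus have, from Lemmas 10 and 11,

$$ P\Bigl[\sup_{j\ge 0}\|x(j)\|^2 > a\Bigr] \le \frac{\left[x_{\rm avg}^2(0) + \frac{2|\mathcal{M}|\Delta^2}{3N^2}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2}}{\bigl(\frac{a}{2N}\bigr)^{1/2}} + \frac{\bigl(1 + x(0)^T\overline{L}\,x(0)\bigr)\prod_{j\ge 0}(1 + g(j))}{1 + \frac{a}{2}\lambda_2(\overline{L})} $$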
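Proof: [Proof of Corollary 13] We note that, for $x(0) \in \mathcal{B}$,

$$ x_{\rm avg}^2(0) \le b^2, \qquad x(0)^T\overline{L}\,x(0) \le N\lambda_N(\overline{L})\,b^2 \tag{123} $$

From Theorem 12, we then get

$$ \begin{aligned} P\Bigl[\sup_{1\le n\le N,\ j\ge 0}|x_n(j)| > a\Bigr] &\le P\Bigl[\sup_{j\ge 0}\|x(j)\| > a\Bigr] \\ &\le \frac{\left[2N x_{\rm avg}^2(0) + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2}}{a} + \frac{\bigl(1 + x(0)^T\overline{L}\,x(0)\bigr)\prod_{j\ge 0}(1 + g(j))}{1 + \frac{a^2}{2}\lambda_2(\overline{L})} \\ &\le \frac{\left[2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2}}{a} + \frac{\bigl(1 + N\lambda_N(\overline{L})\,b^2\bigr)\prod_{j\ge 0}(1 + g(j))}{1 + \frac{a^2}{2}\lambda_2(\overline{L})} \end{aligned} \tag{124} $$
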



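Appendix III
Proof of Lemma 16

Proof: [Proof of Lemma 16] For the proof, consider the sequence $\{x(i)\}_{i\ge 0}$ generated by the QC algorithm,
with the same initial state $x(0)$. Let $\theta$ be the a.s. finite random variable (see eqn. 43) such that

$$ P\Bigl[\lim_{i\to\infty} x(i) = \theta\mathbf{1}\Bigr] = 1 $$

We note that

$$ \begin{aligned} P\bigl[|\tilde{\theta} - r| \ge \epsilon\bigr] &= P\bigl[(|\tilde{\theta} - r| \ge \epsilon)\cap(\tilde{\theta} = \theta)\bigr] + P\bigl[(|\tilde{\theta} - r| \ge \epsilon)\cap(\tilde{\theta} \ne \theta)\bigr] \\ &\le P\bigl[|\theta - r| \ge \epsilon\bigr] + P\bigl[\tilde{\theta} \ne \theta\bigr] \end{aligned} $$

From Chebyshev's inequality, we have

$$ P\bigl[|\theta - r| \ge \epsilon\bigr] \le \frac{E\bigl[|\theta - r|^2\bigr]}{\epsilon^2} \le \frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2}\sum_{j\ge 0}\alpha^2(j) $$

Next, we bound $P[\tilde{\theta} \ne \theta]$. To this end, we note that

$$ \begin{aligned} \sup_{i\ge 0}\sup_{1\le n\le N}\sup_{l\in\Omega_n(i)}|x_l(i) + \nu_{nl}(i)| &\le \sup_{i\ge 0}\sup_{1\le n\le N}\sup_{l\in\Omega_n(i)}|x_l(i)| + \sup_{i\ge 0}\sup_{1\le n\le N}\sup_{l\in\Omega_n(i)}|\nu_{nl}(i)| \\ &\le \sup_{i\ge 0}\sup_{1\le n\le N}|x_n(i)| + \sup_{i\ge 0}\sup_{1\le n\le N}\sup_{l\in\Omega_n(i)}|\nu_{nl}(i)| \\ &\le \sup_{i\ge 0}\sup_{1\le n\le N}|x_n(i)| + \frac{\Delta}{2} \end{aligned} $$

Then, for any $\delta > 0$,

$$ \begin{aligned} P\bigl[\tilde{\theta} \ne \theta\bigr] &\le P\Bigl[\sup_{i\ge 0}\sup_{1\le n\le N}\sup_{l\in\Omega_n(i)}|x_l(i) + \nu_{nl}(i)| \ge \Bigl(p + \frac{1}{2}\Bigr)\Delta\Bigr] \\ &\le P\Bigl[\sup_{i\ge 0}\sup_{1\le n\le N}|x_n(i)| + \frac{\Delta}{2} \ge \Bigl(p + \frac{1}{2}\Bigr)\Delta\Bigr] \\ &= P\Bigl[\sup_{i\ge 0}\sup_{1\le n\le N}|x_n(i)| \ge p\Delta\Bigr] \\ &\le P\Bigl[\sup_{i\ge 0}\sup_{1\le n\le N}|x_n(i)| > p\Delta - \delta\Bigr] \\ &\le \frac{\left[2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2}}{p\Delta - \delta} + \frac{\bigl(1 + N\lambda_N(\overline{L})\,b^2\bigr)\prod_{j\ge 0}(1 + g(j))}{1 + \frac{(p\Delta - \delta)^2}{2}\lambda_2(\overline{L})} \end{aligned} $$

where, in the last step, we use eqn. (124). Since the above holds for arbitrary $\delta > 0$, we have

$$ P\bigl[\tilde{\theta} \ne \theta\bigr] \le \frac{\left[2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\ge 0}\alpha^2(j)\right]^{1/2}}{p\Delta} + \frac{\bigl(1 + N\lambda_N(\overline{L})\,b^2\bigr)\prod_{j\ge 0}(1 + g(j))}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\overline{L})} $$

Combining eqns. (125,126,128), we get the result.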

Keywords: Consensus, quantized, random link 
failures, stochastic approximation, convergence, bounded 
quantizer, sample path behavior, quantizer saturation 
I. Introduction

This paper is concerned with consensus in networks, 
e.g., a sensor network, when the data exchanges among 
nodes in the network (sensors, agents) are quantized. 
Before detailing our work, we briefly overview the 
literature. 

Literature review. Consensus is broadly understood
as individuals in a community achieving a consistent
view of the world by exchanging information regarding
their current state with their neighbors. Considered
in the early work of Tsitsiklis et al. ([?], [?]), it has
received considerable attention in recent years and arises
in numerous applications, including load balancing, [?],
alignment, flocking, and multi-agent collaboration, e.g.,
[?], [?], vehicle formation, [?], gossip algorithms, [?],
tracking, data fusion, [?], and distributed inference, [?].
We refer the reader to the recent overviews on consensus,
which include [?], [?].

Consensus is a distributed iterative algorithm where 
the sensor states evolve on the basis of local interactions. 
Reference [?] used spectral graph concepts like graph 
Laplacian and algebraic connectivity to prove conver- 
gence for consensus under several network operating 
conditions (e.g., delays and switching networks, i.e., 
time varying). Our own prior work has been concerned 
with designing topologies that optimize consensus with 
respect to the convergence rate, [?], [?]. Topology design 
is concerned with two issues: 1) the definition of the 
graph that specifies the neighbors of each sensor — 
i.e., with whom should each sensor exchange data; and 
2) the weights used by the sensors when combining the 
information received from their neighbors to update their 
state. Reference [?] considers the problem of weight 
design, when the topology is specified, in the frame- 
work of semi-definite programming. References [?], [?] 
considered the impact of different topologies on the 
convergence rate of consensus, in particular, regular, ran- 
dom, and small-world graphs, [?]. Reference [?] relates 



the convergence properties of consensus algorithms to 
the effective resistance of the network, thus obtaining 
convergence rate scaling laws for networks in up to 
3-dimensional space. Convergence results for general
problems in multi-vehicle formation have been considered
in [?], where the convergence rate is related to the
topological dimension of the network and stabilizability
issues in higher dimensions are addressed. Robustness
issues in consensus algorithms in the presence of analog
communication noise and random data packet dropouts
have been considered in [?].

Review of literature on quantized consensus. Dis- 
tributed consensus with quantized transmission has been 
studied recently in [?], [?], [?], [?] with respect to 
time-invariant (fixed) topologies. Reference [?] consid- 
ers quantized consensus for a certain class of time- 
varying topologies. The algorithm in [?] is restricted 
to integer-valued initial sensor states, where at each 
iteration the sensors exchange integer-valued data. It is 
shown there that the sensor states are asymptotically 
close (in their appropriate sense) to the desired average, 
but may not reach absolute consensus. In [?], the noise 
in the consensus algorithm studied in [?] is interpreted as 
quantization noise and shown there by simulation with 
a small network that the variance of the quantization 
noise is reduced as the algorithm iterates and the sen- 
sors converge to a consensus. References [?], [?] study 
probabilistic quantized consensus. Each sensor updates 
its state at each iteration by probabilistically quantizing 
its current state (which [?] claims equivalent to dithering) 
and linearly combining it with the quantized versions of 
the states of the neighbors. They show that the sensor 
states reach consensus a.s. to a quantized level. In [?] a 
worst case analysis is presented on the error propagation 
of consensus algorithms with quantized communication 
for various classes of time-invariant network topologies, 
while [?] addresses the impact of more involved encod- 
ing/decoding strategies, beyond the uniform quantizer. 
The effect of communication noise in the consensus 
process may lead to several interesting phase transition
phenomena in global network behavior, see, for example, 
[?] in the context of a network of mobile agents with a 
non-linear interaction model and [?], which rigorously 
establishes a phase transition behavior in a network of 
bipolar agents when the communication noise exceeds 
a given threshold. Consensus algorithms with general 
imperfect communication (including quantization) in a 
certain class of time-varying topologies has been ad- 
dressed in [?], which assumes that there exists a window 
of fixed length, such that the union of the network graphs 
formed within that window is strongly connected. From 
a distributed detection viewpoint, binary consensus algo- 
rithms over networks of additive white Gaussian noise 
channels were addressed in [?], which proposed soft 
information processing techniques to improve consensus 
convergence properties over such noisy channels. The 
impact of fading on consensus is studied in [?]. 

Contributions of this paper. We consider consensus
with quantized data and random inter-sensor link
failures. This is useful in applications where limited
bandwidth and power for inter-sensor communications
preclude exchanges of high precision (analog) data, as
in wireless sensor networks. Further, randomness in the
environment results in random data packet dropouts. To
handle quantization, we modify standard consensus by
adding a small amount of noise, dither, to the data
before quantization and by letting the consensus weights
be time varying, satisfying a persistence condition:
their sum over time diverges, while their square sum is
finite. We will show that dithered quantized consensus
in networks with random links converges.
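
The persistence condition can be checked numerically for a concrete weight sequence. The sketch below is only an illustration: it assumes the sequence $\alpha(i) = s/(i+1)$ with $s = 0.01$, the form used in the numerical study of this paper, and the function names are hypothetical. The partial sums of the weights grow without bound (like $s \ln n$), while the partial sums of the squared weights stay below $s^2\pi^2/6$.

```python
import math

def weight(i, s):
    """Weight at iteration i for the assumed sequence alpha(i) = s / (i + 1)."""
    return s / (i + 1)

def partial_sums(n_terms, s):
    """Return (sum of weights, sum of squared weights) over the first n_terms iterations."""
    total = 0.0
    total_sq = 0.0
    for i in range(n_terms):
        a = weight(i, s)
        total += a
        total_sq += a * a
    return total, total_sq

s = 0.01
square_sum_limit = s**2 * math.pi**2 / 6  # limit of the squared-weight sum

for n_terms in (10**2, 10**4, 10**6):
    total, total_sq = partial_sums(n_terms, s)
    print(n_terms, total, total_sq, square_sum_limit)
```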

The randomness of the network topology is captured
by assuming that the time-varying Laplacian sequence,
$\{L(i)\}_{i\ge 0}$, which characterizes the communication
graph, is independent with mean $\overline{L}$; further, to prove
convergence, we will need the mean graph algebraic
connectivity (first nonzero eigenvalue of $\overline{L}$) $\lambda_2(\overline{L}) > 0$,
i.e., the network to be connected on the average. Our
proofs do not require any distributional assumptions
on the link failure model (in space). During the same
iteration, the link failures can be spatially dependent,
i.e., correlated across different edges of the network. The
model we work with in this paper subsumes the erasure
network model, where link failures are independent both
over space and time. Wireless sensor networks motivate
us since interference among the sensors' communications
correlates the link failures over space, while over time,
it is still reasonable to assume that the channels are
memoryless or independent. Note that the assumption
$\lambda_2(\overline{L}) > 0$ does not require the individual random
instantiations of $L(i)$ to be connected; in fact, it is possible
for all the instantiations to be disconnected. This
captures a broad class of asynchronous communication
models; for example, the random asynchronous gossip
protocol in [?] satisfies $\lambda_2(\overline{L}) > 0$ and hence falls under
this framework.

The main contribution of this paper is the study of 
the convergence and the detailed analysis of the sample 
path of this dithered distributed quantized consensus 
algorithm with random link failures. This distinguishes 
our work from [?] that considers fixed topologies (no 
random links) and integer valued initial sensor states, 
while our initial states are arbitrarily real valued. To our 
knowledge, the convergence and sample path analysis 
of dithered quantized consensus with random links has 
not been carried out before. The sample path analysis 
of quantized consensus algorithms is needed because in 
practice quantizers work with bounded (finite) ranges. 
The literature usually pays scant attention to, or simply
ignores, the boundary effects induced by the bounded
range of the quantizers; in other words, although assuming
finite range quantizers, the analysis in the literature
ignores the boundary effects. Our paper studies carefully
the sample path behavior of quantized consensus when 
the range of the quantizer is bounded. It computes, under 
appropriate conditions, the probability of large excursion 
of the sample paths and shows that the quantizer can be 
designed so that with probability as close to 1 as desired
the sample path excursions remain bounded, within an
$\epsilon$-distance of the desired consensus average. Neither our
previous work [?], which deals with consensus with 
noisy analog communications in a random network, nor 
references [?], [?], [?], which introduce a probabilistic 
quantized consensus algorithm in fixed networks, nor [?], 
which studies consensus with analog noisy communica- 
tion and fixed network, study the sample path behavior 
of quantized consensus. Also, while the probabilistic 
consensus in [?], [?], [?] converges almost surely to 
a quantized level, in our work, we show that dithered 
consensus converges a.s. to a random variable which can 
be made arbitrarily close to the desired average. 

To study the a.s. convergence and m.s.s. conver- 
gence of the dithered distributed quantizers with random 
links and unbounded range, the stochastic approximation 
method we use in [?] is sufficient. In simple terms, 
we associate, like in [?], with the quantized distributed 
consensus a Lyapounov function and study the behavior 
of this Lyapounov function along the trajectories of the 
noisy consensus algorithm with random links. To show 
almost sure convergence, we show that a functional of 
this process is a nonnegative supermartingale; conver- 
gence follows from convergence results on nonnegative 
supermartingales. We do this in Section III, where we
term the unbounded dithered distributed quantized consensus
algorithm with random links simply Quantized
Consensus, or the QC algorithm for short. Although the
general principles of the approach are similar to the ones
in [?], the details are different and not trivial; we minimize
the overlap and refer the reader to [?] for details.
A second reason to go over this analysis in the paper for
the QC algorithm is that we derive in this Section several
specific bounds for QC that are needed as intermediate
results for the sample path analysis carried out in
Section IV, which studies dithered quantized
consensus when the quantizer is bounded, i.e., Quantized
Consensus with Finite quantizer, the QCF algorithm. The
QCF is a very simple algorithm: it runs QC as long as the QC
state stays within the quantizer bound; once the state reaches
the bound, an error is
declared and the algorithm is terminated. To study QCF,
we establish uniform boundedness of the sample paths 
of the QC algorithm. This requires establishing the sta- 
tistical properties of the supremum taken over the sample 
paths of the QC. This is accomplished by splitting the 
state vector of the quantizer in two components: one 
along the consensus subspace and the other along the 
subspace orthogonal to the consensus subspace. These 
proofs use maximal inequalities for submartingale and 
supermartingale sequences. From these, we are able to 
derive probability bounds on the excursions of the two 
subsequences, which we use to derive probability bounds 
on the excursions of the QC. We see that to carry out 
this sample path study requires new methods of anal- 
ysis that go well beyond the stochastic approximation 
methodology that we used in our paper [?], and also 
used by [?] to study consensus with noise but fixed 
networks. The detailed sample path analysis leads to 
bounds on the probability of the sample path excursions 
of the QC algorithm. We then use these bounds to design 
the quantizer parameters and to explore tradeoffs among 
these parameters. In particular, we derive a probability 
of e-consensus expressed in terms of the (finite) number 
of quantizer levels, the size of the quantization steps, the 
desired probability of saturation, and the desired level of 
accuracy e away from consensus. 

For the QC algorithm, there exists an interesting trade- 
off between the m.s.e. (between the limiting random 
variable and the desired initial average) and the conver- 
gence: by tuning the link weight sequence appropriately, 
it is possible to make the m.s.e. arbitrarily small (irre- 
spective of the quantization step-size), though penalizing 
the convergence rate. To tune the QC-algorithm, we 
introduce a scalar control parameter s (associated with 
the time-varying link weight sequence), which can make 
the m.s.e. as small as we want, irrespective of how large
the step-size $\Delta$ is. This is significant in applications
that rely on accuracy and may call for a very small m.s.e.
to be useful. More specifically, if a cost structure is
imposed on the consensus problem, where the objective
is a function of the m.s.e. and the convergence rate, one
may obtain the optimal scaling $s$ by minimizing the cost
over the Pareto-optimal curve generated by varying $s$.
These tradeoffs and the vanishingly small m.s.e. contrast
with the algorithms in [?], [?], [?], [?], [?], where the
m.s.e. is proportional to $\Delta^2$, where $\Delta$ is the quantization
step-size: if the step-size is large, these algorithms lead to a large
m.s.e.

Organization of the paper. We comment briefly on 
the organization of the main sections of the paper. Sec- 
tion II summarizes relevant background, including spec- 
tral graph theory and average consensus, and presents 
the dithered quantized consensus problem with the dither 
satisfying the Schuchman conditions. Section III considers
the convergence of the QC algorithm. It shows
a.s. convergence to a random variable, whose m.s.e. is 
fully characterized. Section IV studies the sample path 
behavior of the QC algorithm through the QCF. It uses 
the expressions we derive for the probability of large 
excursions of the sample paths of the quantizer to con- 
sider the tradeoffs among different quantizer parameters, 
e.g., number of bits and quantization step, and the 
network topology to achieve optimal performance under 
a constraint on the number of levels of the quantizer. 
These tradeoffs are illustrated with a numerical study. 
Finally, Section V concludes the paper. 



II. Consensus with Quantized Data: Problem 
Statement 

We present preliminaries needed for the analysis of 
the consensus algorithm with quantized data. The set-up 
of the average consensus problem is standard, see the 
introductory sections of relevant recent papers. 



A. Preliminaries: Notation and Average Consensus 

The sensor network at time index $i$ is represented
by an undirected, simple, connected graph $G(i) = (V, E(i))$.
The vertex and edge sets $V$ and $E(i)$, with
cardinalities $|V| = N$ and $|E(i)| = M(i)$, collect the
sensors and the communication channels or links among
sensors in the network at time $i$. The network topology
at time $i$, i.e., with which sensors each sensor
communicates, is described by the $N \times N$ discrete
Laplacian $L(i) = L^T(i) = D(i) - A(i) \ge 0$. The
matrix $A(i)$ is the adjacency matrix of the connectivity
graph at time $i$, a $(0,1)$ matrix where $A_{nk}(i) = 1$
signifies that there is a link between sensors $n$ and
$k$ at time $i$. The diagonal entries of $A(i)$ are zero.
The diagonal matrix $D(i)$ is the degree matrix, whose
diagonal entries are $D_{nn}(i) = d_n(i)$, where $d_n(i)$ is the degree
of sensor $n$, i.e., the number of links of sensor $n$ at
time $i$. The neighbors of a sensor or node $n$, collected
in the neighborhood set $\Omega_n(i)$, are those sensors $k$ for
which $A_{nk}(i) \ne 0$. The Laplacian is positive
semidefinite; in case the network is connected at time
$i$, the corresponding algebraic connectivity or Fiedler
value is positive, i.e., the second eigenvalue of the
Laplacian $\lambda_2(L(i)) > 0$, where the eigenvalues of $L(i)$
are ordered in increasing order. For a detailed treatment
of graphs and their spectral theory see, for example, [?],
[?], [?]. Throughout the paper, the symbols $P[\cdot]$ and $E[\cdot]$
denote the probability and expectation operators w.r.t.
the probability space of interest.

Distributed Average Consensus. The sensors measure
the data $x_n(0)$, $n = 1, \cdots, N$, collected in the
vector $x(0) = [x_1(0) \cdots x_N(0)]^T \in \mathbb{R}^{N\times 1}$. Distributed
average consensus computes the average $r$ of the data

$$ r = x_{\rm avg}(0) = \frac{1}{N}\sum_{n=1}^{N} x_n(0) = \frac{1}{N}\,x(0)^T\mathbf{1} \tag{1} $$

by local data exchanges among neighboring sensors.
In (1), the column vector $\mathbf{1}$ has all entries equal to 1.
Consensus is an iterative algorithm where at iteration $i$
each sensor updates its current state $x_n(i)$ by a weighted
average of its current state and the states of its neighbors.
Standard consensus assumes a fixed connected network
topology, i.e., the links stay online permanently, the
communication is noiseless, and the data exchanges are
analog. Under mild conditions, the states of all sensors
reach consensus, converging to the desired average $r$, see
[?], [?],

$$ \lim_{i\to\infty} x(i) = r\mathbf{1} \tag{2} $$

where $x(i) = [x_1(i) \cdots x_N(i)]^T$ is the state vector that
stacks the states of the $N$ sensors at iteration $i$. We
consider consensus with quantized data exchanges and
random topology (links fail or become alive at random
times), which models packet dropouts. In [?], we studied
consensus with random topologies and (analog) noisy
communications.

B. Dithered Quantization: Schuchman Conditions 

We write the sensor updating equations for consensus
with quantized data and random link failures as

$$ x_n(i+1) = \bigl[1 - \alpha(i)d_n(i)\bigr]x_n(i) + \alpha(i)\sum_{l\in\Omega_n(i)} f_{nl,i}[x_l(i)] \tag{3} $$

where $\alpha(i)$ is the weight at iteration $i$, and
$\{f_{nl,i}\}_{1\le n,l\le N,\ i\ge 0}$ is a sequence of functions (possibly
random) modeling the quantization effects. Note that
in (3), the weights $\alpha(i)$ are the same across all links
(the equal weights consensus, see [?]), but the weights
may change with time. Also, the degree $d_n(i)$ and the
neighborhood $\Omega_n(i)$ of each sensor $n$, $n = 1, \cdots, N$,
depend on $i$, emphasizing that the topology may be
random and time-varying.

Quantizer. Each inter-sensor communication channel uses a uniform quantizer with quantization step $\Delta$. We model the communication channel by introducing the quantizing function, $q(\cdot) : \mathbb{R} \to \mathcal{Q}$,

\[ q(y) = k\Delta, \quad (k - \tfrac{1}{2})\Delta \le y < (k + \tfrac{1}{2})\Delta \tag{4} \]

where $y \in \mathbb{R}$ is the channel input. We write

\[ q(y) = y + e(y) \tag{5} \]

where $e(y)$ is the quantization error. Conditioned on the input, the quantization error $e(y)$ is deterministic, and

\[ -\frac{\Delta}{2} < e(y) \le \frac{\Delta}{2}, \quad \forall y \tag{6} \]
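The quantizer in (4)-(6) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the function names `quantize` and `quantization_error` are hypothetical, and the rounding rule (round half up) is chosen so that the cell boundaries match (4).

```python
import math


def quantize(y: float, delta: float) -> float:
    """Uniform quantizer of eqn. (4): returns k*delta, where k is the
    integer with (k - 1/2)*delta <= y < (k + 1/2)*delta."""
    k = math.floor(y / delta + 0.5)
    return k * delta


def quantization_error(y: float, delta: float) -> float:
    """Deterministic quantization error e(y) = q(y) - y of eqn. (5)."""
    return quantize(y, delta) - y


if __name__ == "__main__":
    delta = 0.5
    for y in (-1.3, -0.25, 0.0, 0.24, 0.25, 1.1):
        print(y, quantize(y, delta), quantization_error(y, delta))
```

For a fixed input the error is a fixed number, which is the property that makes the undithered scheme hard to analyze.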



We first consider quantized consensus (QC) with unbounded range, i.e., the quantization alphabet

\[ \mathcal{Q} = \{ k\Delta \mid k \in \mathbb{Z} \} \tag{7} \]

is countably infinite. In Section IV, we consider what happens when the range of the quantizer is finite: quantized consensus with finite (QCF) alphabet. This study requires that we detail the sample path behavior of the QC algorithm.

We discuss briefly why a naive approach to consensus will fail (see [?] for a similar discussion). If we use directly the quantized state information, the functions $f_{nl,i}(\cdot)$ in eqn. (3) are

\begin{align}
f_{nl,i}(x_l(i)) &= q(x_l(i)) \tag{8} \\
&= x_l(i) + e(x_l(i)), \quad 1 \le n \le N \tag{9}
\end{align}

Equations (3) then take the form

\[ x_n(i+1) = (1 - \alpha(i) d_n(i))\, x_n(i) + \alpha(i) \sum_{l \in \Omega_n(i)} x_l(i) + \alpha(i) \sum_{l \in \Omega_n(i)} e(x_l(i)) \tag{10} \]

The non-stochastic errors (the rightmost term in (10)) lead to error accumulation. If the network topology remains fixed (deterministic topology), the update in eqn. (10) represents a sequence of iterations that, as observed above, are deterministic once conditioned on the initial state, which determines the quantizer inputs. If we choose the weights $\alpha(i)$ to decrease to zero very quickly, then (10) may terminate before reaching the consensus set. On the other hand, if the $\alpha(i)$ decay slowly, the quantization errors may accumulate, thus making the states unbounded.

In either case, the naive approach to consensus with quantized data fails to lead to a reasonable solution. This failure is due to the fact that the error terms are not stochastic. To overcome these problems, we introduce noise (dither) in a controlled way to randomize the sensor states prior to quantizing the perturbed stochastic state. Under appropriate conditions, the resulting quantization errors possess nice statistical properties, leading to the quantized states reaching consensus (in an appropriate sense to be defined below). Dither places consensus with quantized data in the framework of distributed consensus with noisy communication links; when the range of the quantizer is unbounded, we apply stochastic approximation to study the limiting behavior of QC, as we did in [?] to study consensus with (analog) noise and random topology. Note that if, instead of adding dither, we assumed that the quantization errors are independent, uniformly distributed random variables, we would not need to add dither, and our analysis would still apply.

Schuchman conditions. The dither added to randomize the quantization effects satisfies a special condition, namely, as in subtractively dithered systems, see [?], [?]. Let $\{y(i)\}_{i \ge 0}$ and $\{\nu(i)\}_{i \ge 0}$ be arbitrary sequences of random variables, and $q(\cdot)$ be the quantization function (4). When dither is added before quantization, the quantization error sequence, $\{\varepsilon(i)\}_{i \ge 0}$, is

\[ \varepsilon(i) = q(y(i) + \nu(i)) - (y(i) + \nu(i)) \tag{11} \]

If the dither sequence, $\{\nu(i)\}_{i \ge 0}$, satisfies the Schuchman conditions, [?], then the quantization error sequence, $\{\varepsilon(i)\}_{i \ge 0}$, in (11) is i.i.d. uniformly distributed on $[-\Delta/2, \Delta/2)$ and independent of the input sequence $\{y(i)\}_{i \ge 0}$ (see [?], [?], [?]). A sufficient condition for $\{\nu(i)\}$ to satisfy the Schuchman conditions is for it to be a sequence of i.i.d. random variables uniformly distributed on $[-\Delta/2, \Delta/2)$ and independent of the input sequence $\{y(i)\}_{i \ge 0}$. In the sequel, the dither $\{\nu(i)\}_{i \ge 0}$ satisfies the Schuchman conditions. Hence, the quantization error sequence, $\{\varepsilon(i)\}$, is i.i.d. uniformly distributed on $[-\Delta/2, \Delta/2)$ and independent of the input sequence $\{y(i)\}_{i \ge 0}$.
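The effect of uniform dither on the error (11) can be checked empirically. The sketch below is only an illustration under stated assumptions: the helper names, the constant test input, the sample size, and the seed are all hypothetical choices, and the check is a Monte Carlo estimate rather than a proof.

```python
import math
import random


def quantize(y, delta):
    """Uniform quantizer of eqn. (4) with step delta."""
    return math.floor(y / delta + 0.5) * delta


def dithered_errors(inputs, delta, seed=0):
    """Return the errors of eqn. (11) when i.i.d. dither, uniform on
    [-delta/2, delta/2), is added to each input before quantization."""
    rng = random.Random(seed)
    errors = []
    for y in inputs:
        nu = rng.uniform(-delta / 2, delta / 2)
        errors.append(quantize(y + nu, delta) - (y + nu))
    return errors


def sample_mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    delta = 0.2
    inputs = [0.07] * 20000  # constant input: undithered error is always -0.07
    errs = dithered_errors(inputs, delta)
    print("undithered error:", quantize(0.07, delta) - 0.07)
    print("dithered mean:", sample_mean(errs))
    print("dithered variance:", sample_variance(errs), "vs", delta ** 2 / 12)
```

With the constant input, the undithered error never changes, whereas the dithered errors should have sample mean near zero and sample variance near $\Delta^2/12$, the variance of a uniform variable on an interval of length $\Delta$.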

C. Dithered Quantized Consensus With Random Link 
Failures: Problem Statement 

We now return to the problem formulation of consensus with quantized data with dither added. Introducing the sequence, $\{\nu_{nl}(i)\}_{i \ge 0,\, 1 \le n,l \le N}$, of i.i.d. random variables, uniformly distributed on $[-\Delta/2, \Delta/2)$, the state update equation for quantized consensus is:

\[ x_n(i+1) = (1 - \alpha(i) d_n(i))\, x_n(i) + \alpha(i) \sum_{l \in \Omega_n(i)} q[x_l(i) + \nu_{nl}(i)], \quad 1 \le n \le N \tag{12} \]

This equation shows that, before transmitting its state $x_l(i)$ to the $n$-th sensor, sensor $l$ adds the dither $\nu_{nl}(i)$; the channel between sensors $n$ and $l$ then quantizes this corrupted state; and, finally, sensor $n$ receives this quantized output. Using eqn. (11), the state update is

\[ x_n(i+1) = (1 - \alpha(i) d_n(i))\, x_n(i) + \alpha(i) \sum_{l \in \Omega_n(i)} [x_l(i) + \nu_{nl}(i) + \varepsilon_{nl}(i)] \tag{13} \]

The random variables $\nu_{nl}(i)$ are independent of the state $x(j)$, i.e., the states of all sensors at iteration $j$, for $j \le i$. Hence, the collection $\{\varepsilon_{nl}(i)\}$ consists of i.i.d. random variables uniformly distributed on $[-\Delta/2, \Delta/2)$, and the random variable $\varepsilon_{nl}(i)$ is also independent of the state $x(j)$, $j \le i$.

We rewrite (13) in vector form. Define the random vectors, $\Upsilon(i)$ and $\Psi(i) \in \mathbb{R}^{N \times 1}$, with components

\[ \Upsilon_n(i) = - \sum_{l \in \Omega_n(i)} \nu_{nl}(i) \tag{14} \]

\[ \Psi_n(i) = - \sum_{l \in \Omega_n(i)} \varepsilon_{nl}(i) \tag{15} \]

Then the $N$ state update equations in (13) become in vector form

\[ x(i+1) = x(i) - \alpha(i) \left[ L(i) x(i) + \Upsilon(i) + \Psi(i) \right] \tag{16} \]

where $\Upsilon(i)$ and $\Psi(i)$ are zero mean vectors, independent of the state $x(i)$, and have i.i.d. components. Also, if $|\mathcal{M}|$ is the number of realizable network links, eqns. (14) and (15) lead to

\[ \mathbb{E}\left[ \|\Upsilon(i)\|^2 \right] = \mathbb{E}\left[ \|\Psi(i)\|^2 \right] \le \frac{|\mathcal{M}| \Delta^2}{6} \tag{17} \]

Random Link Failures: We now state the assumption about the link failure model to be adopted throughout the paper. The graph Laplacians are

\[ L(i) = \overline{L} + \widetilde{L}(i), \quad \forall i \ge 0 \tag{18} \]

where $\{L(i)\}_{i \ge 0}$ is a sequence of i.i.d. Laplacian matrices with mean $\overline{L} = \mathbb{E}[L(i)]$, such that $\lambda_2(\overline{L}) > 0$ (we just require the network to be connected on the average). We do not make any distributional assumptions on the link failure model. During the same iteration, the link failures can be spatially dependent, i.e., correlated across different edges of the network. This model subsumes the erasure network model, where the link failures are independent both over space and time. Wireless sensor networks motivate this model since interference among the sensors' communications correlates the link failures over space, while over time, it is still reasonable to assume that the channels are memoryless or independent.

We also note that the above assumption $\lambda_2(\overline{L}) > 0$ does not require the individual random instantiations of $L(i)$ to be connected; in fact, it is possible for all the instantiations to be disconnected. This enables us to capture a broad class of asynchronous communication models; for example, the random asynchronous gossip protocol analyzed in [?] satisfies $\lambda_2(\overline{L}) > 0$ and hence falls under this framework. More generally, in the asynchronous setup, if the sensor nodes are equipped with independent clocks whose ticks follow a regular random point process (the ticking instants do not have an accumulation point, which is true for all renewal processes, in particular, the Poisson clock in [?]), and at each tick a random network is realized with $\lambda_2(\overline{L}) > 0$ independent of the networks realized in previous ticks (this is the case with the link formation process assumed in [?]), our algorithm applies.¹

We denote the set of network edges at time $i$ as $\mathcal{M}(i)$, where $\mathcal{M}(i)$ is a random subset of the set of all possible edges $\mathcal{E}$ with $|\mathcal{E}| = N(N-1)/2$. Let $\mathcal{M}$ denote the set of realizable edges. We then have the inclusion

\[ \mathcal{M}(i) \subset \mathcal{M} \subset \mathcal{E}, \quad \forall i \tag{19} \]

It is important to note that the value of $|\mathcal{M}(i)|$ depends on the link usage protocol. For example, in the asynchronous gossip protocol considered in [?], at each iteration only one link is active, and hence $|\mathcal{M}(i)| = 1$.

Independence Assumptions: We assume that the Laplacian sequence $\{L(i)\}_{i \ge 0}$ is independent of the dither sequence $\{\varepsilon_{nl}(i)\}$.

Persistence condition: To obtain convergence, we assume that the gains $\alpha(i)$ satisfy the following.

\[ \alpha(i) > 0, \quad \sum_{i \ge 0} \alpha(i) = \infty, \quad \sum_{i \ge 0} \alpha^2(i) < \infty \tag{20} \]

Condition (20) assures that the gains decay to zero, but not too fast. It is standard in stochastic adaptive signal processing and control; it is also used in consensus with noisy communications in [?], [?].
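As a worked instance of condition (20), consider polynomially decaying gains of the form used later in the paper's lemmas. The equivalences below follow from the standard $p$-series test; they are a side calculation, not a statement taken from the source.

```latex
\alpha(i) = \frac{a}{(i+1)^{\tau}}, \quad a > 0:
\qquad
\sum_{i \ge 0} \alpha(i) = a \sum_{k \ge 1} k^{-\tau} = \infty
\iff \tau \le 1,
\qquad
\sum_{i \ge 0} \alpha^2(i) = a^2 \sum_{k \ge 1} k^{-2\tau} < \infty
\iff \tau > \tfrac{1}{2}.
```

Hence such gains satisfy (20) exactly when $1/2 < \tau \le 1$; for example, $\alpha(i) = 1/(i+1)$ qualifies, while $\alpha(i) = 1/\sqrt{i+1}$ decays too slowly and $\alpha(i) = 1/(i+1)^2$ too quickly.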

Markov property. Denote the natural filtration of the process $X = \{x(i)\}_{i \ge 0}$ by $\{\mathcal{F}_i^X\}_{i \ge 0}$. Because the dither random variables $\nu_{nl}(i)$, $1 \le n,l \le N$, are independent of $\mathcal{F}_i^X$ at any time $i \ge 0$, and, correspondingly, the noises $\Upsilon(i)$ and $\Psi(i)$ are independent of $x(i)$, the process $X$ is Markov.
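The complete model of this section (dithered update (12), i.i.d. link failures, and gains satisfying (20)) can be simulated in a few lines. The sketch below is an illustration under assumptions that are not in the source: a 5-node ring as the set of realizable edges, independent link erasures with a common probability `link_prob`, gains $\alpha(i) = a/(i+1)^{\tau}$, and hypothetical function and parameter names.

```python
import math
import random


def quantize(y, delta):
    """Uniform quantizer of eqn. (4) with step delta."""
    return math.floor(y / delta + 0.5) * delta


def qc_simulate(x0, edges, delta, a, tau, link_prob, steps, seed=0):
    """Run the dithered quantized consensus update (12).

    edges      -- realizable undirected links (u, v)
    link_prob  -- probability that a link is alive at an iteration
                  (independent erasures: one special case of the model)
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for i in range(steps):
        alpha = a / (i + 1) ** tau  # gain alpha(i)
        # Sample the active topology for this iteration.
        neighbors = [[] for _ in range(n)]
        for u, v in edges:
            if rng.random() < link_prob:
                neighbors[u].append(v)
                neighbors[v].append(u)
        new_x = []
        for node in range(n):
            received = 0.0
            for l in neighbors[node]:
                # Sensor l adds fresh dither before the channel quantizes.
                nu = rng.uniform(-delta / 2, delta / 2)
                received += quantize(x[l] + nu, delta)
            degree = len(neighbors[node])
            new_x.append((1 - alpha * degree) * x[node] + alpha * received)
        x = new_x
    return x


if __name__ == "__main__":
    ring = [(k, (k + 1) % 5) for k in range(5)]
    final = qc_simulate([0.0, 2.5, 5.0, 7.5, 10.0], ring, delta=0.1,
                        a=0.4, tau=0.6, link_prob=0.8, steps=3000)
    print(final)
```

In such a run the final states should be close to one another and close to the initial average (5.0 here), but not exactly equal to it, since the limit is a random variable.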



¹In case the network is static, i.e., the connectivity graph is time-invariant, all the results in the paper apply with $L(i) \equiv \overline{L}$, $\forall i$.

III. Consensus With Quantized Data:
Unbounded Quantized States

We consider that the dynamic range of the initial sensor data, whose average we wish to compute, is not known. To avoid quantizer saturation, the quantizer output takes values in the countable alphabet (7), and so the channel quantizer has unrestricted dynamic range. This is the quantizer consensus (QC) with unbounded range algorithm. Section IV studies quantization with bounded range, i.e., the quantized consensus finite-bit (QCF) algorithm where the channel quantizers take only a finite number of output values (finite-bit quantizers).

We comment briefly on the organization of the remainder of this section. Subsection III-A proves the a.s. convergence of the QC algorithm. We characterize the performance of the QC algorithm and derive expressions for the mean-squared error in Subsection III-B. The tradeoff between m.s.e. and convergence rate is studied in Subsection III-C. Finally, we present generalizations to the approach in Subsection III-D.

A. QC Algorithm: Convergence 

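We start with the definition of the consensus subspace $\mathcal{C}$ given as

\[ \mathcal{C} = \{ x \in \mathbb{R}^{N \times 1} \mid x = a\mathbf{1},\ a \in \mathbb{R} \} \tag{21} \]

We note that any vector $x \in \mathbb{R}^N$ can be uniquely decomposed as

\[ x = x_{\mathcal{C}} + x_{\mathcal{C}^\perp} \tag{22} \]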



and 



Fc 



(23) 



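where $x_{\mathcal{C}} \in \mathcal{C}$ and $x_{\mathcal{C}^\perp}$ belongs to $\mathcal{C}^\perp$, the orthogonal subspace of $\mathcal{C}$. We show that (16), under the model in Subsection II-C, converges a.s. to a finite point in $\mathcal{C}$. Define the component-wise average as

\[ x_{\mathrm{avg}}(i) = \frac{1}{N} \mathbf{1}^T x(i) \tag{24} \]

We prove the a.s. convergence of the QC algorithm in two stages. Theorem 2 proves that the state vector sequence $\{x(i)\}_{i \ge 0}$ converges a.s. to the consensus subspace $\mathcal{C}$. Theorem 3 then completes the proof by showing that the sequence of component-wise averages, $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$, converges a.s. to a finite random variable $\theta$. The proof of Theorem 3 needs a basic result on convergence of Markov processes and follows the same theme as in [?].

Stochastic approximation: Convergence of Markov processes. We state a slightly modified form, suitable to our needs, of a result from [?]. We start by introducing notation, following [?], see also [?].

Let $X = \{x(i)\}_{i \ge 0}$ be Markov in $\mathbb{R}^{N \times 1}$. The generating operator $\mathcal{L}$ is

\[ \mathcal{L}V(i,x) = \mathbb{E}\left[ V(i+1, x(i+1)) \mid x(i) = x \right] - V(i,x) \quad \text{a.s.} \tag{25} \]

for functions $V(i,x)$, $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, provided the conditional expectation exists. We say that $V(i,x) \in D_{\mathcal{L}}$ in a domain $A$, if $\mathcal{L}V(i,x)$ is finite for all $(i,x) \in A$.

Let the Euclidean metric be $\rho(\cdot)$. Define the $\epsilon$-neighborhood of $B \subset \mathbb{R}^{N \times 1}$ and its complementary set

\[ U_\epsilon(B) = \left\{ x \,\middle|\, \inf_{y \in B} \rho(x,y) < \epsilon \right\} \tag{26} \]

\[ V_\epsilon(B) = \mathbb{R}^{N \times 1} \setminus U_\epsilon(B) \tag{27} \]

Theorem 1 (Convergence of Markov Processes) Let: $X$ be a Markov process with generating operator $\mathcal{L}$; $V(i,x) \in D_{\mathcal{L}}$ a non-negative function in the domain $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, and $B \subset \mathbb{R}^{N \times 1}$. Assume:

1) Potential function:

\[ \inf_{i \ge 0,\, x \in V_\epsilon(B)} V(i,x) > 0, \quad \forall \epsilon > 0 \tag{28} \]

\[ V(i,x) \equiv 0, \quad x \in B \tag{29} \]

\[ \lim_{x \to B} \sup_{i \ge 0} V(i,x) = 0 \tag{30} \]

2) Generating operator:

\[ \mathcal{L}V(i,x) \le g(i)(1 + V(i,x)) - \alpha(i)\varphi(i,x) \tag{31} \]

where $\varphi(i,x)$, $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, is a non-negative function such that

\[ \inf_{i,\, x \in V_\epsilon(B)} \varphi(i,x) > 0, \quad \forall \epsilon > 0 \tag{32} \]

\[ \alpha(i) > 0, \quad \sum_{i \ge 0} \alpha(i) = \infty \tag{33} \]

\[ g(i) > 0, \quad \sum_{i \ge 0} g(i) < \infty \tag{34} \]

Then, the Markov process $X = \{x(i)\}_{i \ge 0}$ with arbitrary initial distribution converges a.s. to $B$ as $i \to \infty$:

\[ \mathbb{P}\left( \lim_{i \to \infty} \rho(x(i), B) = 0 \right) = 1 \tag{35} \]

Proof: For proof, see [?], [?].

Theorem 2 (a.s. convergence to consensus subspace) Consider the quantized distributed averaging algorithm given in eqns. (16). Then, for arbitrary initial condition, $x(0)$, we have

\[ \mathbb{P}\left[ \lim_{i \to \infty} \rho(x(i), \mathcal{C}) = 0 \right] = 1 \tag{36} \]

Proof: The proof uses similar arguments as that of Theorem 3 in [?]. So we provide the main steps here and only those details which are required for later development of the paper.

The key idea shows that the quantized iterations satisfy the assumptions of Theorem 1. Define the potential function, $V(i,x)$, for the Markov process $X$ as

\[ V(i,x) = x^T \overline{L} x \tag{37} \]

Then, using the properties of $\overline{L}$ and the continuity of $V(i,x)$,

\[ V(i,x) \equiv 0, \ x \in \mathcal{C} \quad \text{and} \quad \lim_{x \to \mathcal{C}} \sup_{i \ge 0} V(i,x) = 0 \tag{38} \]

For $x \in \mathbb{R}^{N \times 1}$, we clearly have $\rho(x, \mathcal{C}) = \|x_{\mathcal{C}^\perp}\|$. Using the fact that $x^T \overline{L} x \ge \lambda_2(\overline{L}) \|x_{\mathcal{C}^\perp}\|^2$ it then follows

\[ \inf_{i \ge 0,\, x \in V_\epsilon(\mathcal{C})} V(i,x) \ge \inf_{i \ge 0,\, x \in V_\epsilon(\mathcal{C})} \lambda_2(\overline{L}) \|x_{\mathcal{C}^\perp}\|^2 \ge \lambda_2(\overline{L}) \epsilon^2 > 0 \tag{39} \]

since $\lambda_2(\overline{L}) > 0$. This shows, together with (38), that $V(i,x)$ satisfies (28)-(30).

Now consider $\mathcal{L}V(i,x)$. We have, using the fact that $\widetilde{L}(i)x = \widetilde{L}(i)x_{\mathcal{C}^\perp}$ and the independence assumptions,

\begin{align}
\mathcal{L}V(i,x) &= \mathbb{E}\Big[ \big( x(i) - \alpha(i)\overline{L}x(i) - \alpha(i)\widetilde{L}(i)x(i) - \alpha(i)\Upsilon(i) - \alpha(i)\Psi(i) \big)^T \overline{L} \\
&\qquad \big( x(i) - \alpha(i)\overline{L}x(i) - \alpha(i)\widetilde{L}(i)x(i) - \alpha(i)\Upsilon(i) - \alpha(i)\Psi(i) \big) \,\Big|\, x(i) = x \Big] - x^T \overline{L} x \\
&\le -2\alpha(i) x^T \overline{L}^2 x + \alpha^2(i) \lambda_N^3(\overline{L}) \|x_{\mathcal{C}^\perp}\|^2 + \alpha^2(i) \lambda_N(\overline{L})\, \mathbb{E}\big[ \lambda_{\max}^2(\widetilde{L}(i)) \big] \|x_{\mathcal{C}^\perp}\|^2 \\
&\quad + 2\alpha^2(i) \lambda_N(\overline{L}) \big( \mathbb{E}[\|\Upsilon(i)\|^2] \big)^{1/2} \big( \mathbb{E}[\|\Psi(i)\|^2] \big)^{1/2} + \alpha^2(i) \lambda_N(\overline{L})\, \mathbb{E}[\|\Upsilon(i)\|^2] \\
&\quad + \alpha^2(i) \lambda_N(\overline{L})\, \mathbb{E}[\|\Psi(i)\|^2] \tag{40}
\end{align}

Since $x^T \overline{L} x \ge \lambda_2(\overline{L}) \|x_{\mathcal{C}^\perp}\|^2$, the eigenvalues of $\widetilde{L}(i)$ are not greater than $2N$ in magnitude, and from (17) we get

\begin{align}
\mathcal{L}V(i,x) &\le -2\alpha(i) x^T \overline{L}^2 x + \left( \frac{\alpha^2(i) \lambda_N^3(\overline{L})}{\lambda_2(\overline{L})} + \frac{4\alpha^2(i) N^2 \lambda_N(\overline{L})}{\lambda_2(\overline{L})} \right) x^T \overline{L} x + \frac{2\alpha^2(i) |\mathcal{M}| \Delta^2 \lambda_N(\overline{L})}{3} \\
&\le -\alpha(i) \varphi(i,x) + g(i) \left[ 1 + V(i,x) \right] \tag{41}
\end{align}

where

\[ \varphi(i,x) = 2 x^T \overline{L}^2 x, \quad g(i) = \alpha^2(i) \max\left( \frac{\lambda_N^3(\overline{L})}{\lambda_2(\overline{L})} + \frac{4N^2 \lambda_N(\overline{L})}{\lambda_2(\overline{L})},\ \frac{2|\mathcal{M}| \Delta^2 \lambda_N(\overline{L})}{3} \right) \tag{42} \]

Clearly, $\mathcal{L}V(i,x)$ and $\varphi(i,x)$, $g(i)$ satisfy the remaining assumptions (31)-(34) of Theorem 1; hence,

\[ \mathbb{P}\left[ \lim_{i \to \infty} \rho(x(i), \mathcal{C}) = 0 \right] = 1 \tag{43} \]

The convergence proof for QC will now be completed in the next Theorem.

Theorem 3 (Consensus to finite random variable) Consider (16), with arbitrary initial condition $x(0) \in \mathbb{R}^{N \times 1}$ and the state sequence $\{x(i)\}_{i \ge 0}$. Then there exists a finite random variable $\theta$ such that

\[ \mathbb{P}\left[ \lim_{i \to \infty} x(i) = \theta \mathbf{1} \right] = 1 \tag{44} \]

Proof: Define the filtration $\{\mathcal{F}_i\}_{i \ge 0}$ as

\[ \mathcal{F}_i = \sigma\left\{ x(0), \{L(j)\}_{0 \le j < i}, \{\Upsilon(j)\}_{0 \le j < i}, \{\Psi(j)\}_{0 \le j < i} \right\} \tag{45} \]

We will now show that the sequence $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ is an $\mathcal{L}_2$-bounded martingale w.r.t. $\{\mathcal{F}_i\}_{i \ge 0}$. In fact,

\[ x_{\mathrm{avg}}(i+1) = x_{\mathrm{avg}}(i) - \alpha(i) \overline{\Upsilon}(i) - \alpha(i) \overline{\Psi}(i) \tag{46} \]

where $\overline{\Upsilon}(i)$ and $\overline{\Psi}(i)$ are the component-wise averages given by

\[ \overline{\Upsilon}(i) = \frac{1}{N} \mathbf{1}^T \Upsilon(i), \quad \overline{\Psi}(i) = \frac{1}{N} \mathbf{1}^T \Psi(i) \tag{47} \]

Then,

\begin{align}
\mathbb{E}\left[ x_{\mathrm{avg}}(i+1) \mid \mathcal{F}_i \right] &= x_{\mathrm{avg}}(i) - \alpha(i)\, \mathbb{E}\left[ \overline{\Upsilon}(i) \mid \mathcal{F}_i \right] - \alpha(i)\, \mathbb{E}\left[ \overline{\Psi}(i) \mid \mathcal{F}_i \right] \\
&= x_{\mathrm{avg}}(i) \tag{48}
\end{align}

where the last step follows from the fact that $\Upsilon(i)$ is independent of $\mathcal{F}_i$, and

\[ \mathbb{E}\left[ \overline{\Psi}(i) \mid \mathcal{F}_i \right] = \mathbb{E}\left[ \overline{\Psi}(i) \mid x(i) \right] = 0 \tag{49} \]

because $\Psi(i)$ is independent of $x(i)$ as argued in Section II-B.

Thus, the sequence $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ is a martingale. For proving $\mathcal{L}_2$ boundedness, note

\begin{align}
\mathbb{E}\left[ x_{\mathrm{avg}}^2(i+1) \right] &= \mathbb{E}\left[ x_{\mathrm{avg}}(i) - \alpha(i)\overline{\Upsilon}(i) - \alpha(i)\overline{\Psi}(i) \right]^2 \\
&= \mathbb{E}\left[ x_{\mathrm{avg}}^2(i) \right] + \alpha^2(i)\, \mathbb{E}\left[ \overline{\Upsilon}^2(i) \right] + \alpha^2(i)\, \mathbb{E}\left[ \overline{\Psi}^2(i) \right] + 2\alpha^2(i)\, \mathbb{E}\left[ \overline{\Upsilon}(i) \overline{\Psi}(i) \right] \\
&\le \mathbb{E}\left[ x_{\mathrm{avg}}^2(i) \right] + \alpha^2(i)\, \mathbb{E}\left[ \overline{\Upsilon}^2(i) \right] + \alpha^2(i)\, \mathbb{E}\left[ \overline{\Psi}^2(i) \right] + 2\alpha^2(i) \left( \mathbb{E}\left[ \overline{\Upsilon}^2(i) \right] \right)^{1/2} \left( \mathbb{E}\left[ \overline{\Psi}^2(i) \right] \right)^{1/2} \tag{50}
\end{align}

Again, it can be shown by using the independence properties and (17) that

\[ \mathbb{E}\left[ \overline{\Upsilon}^2(i) \right] = \mathbb{E}\left[ \overline{\Psi}^2(i) \right] \le \frac{|\mathcal{M}| \Delta^2}{6N^2} \tag{51} \]

where $|\mathcal{M}|$ is the number of realizable edges in the network (eqn. (19)). It then follows from eqn. (50) that

\[ \mathbb{E}\left[ x_{\mathrm{avg}}^2(i+1) \right] \le \mathbb{E}\left[ x_{\mathrm{avg}}^2(i) \right] + \frac{2\alpha^2(i) |\mathcal{M}| \Delta^2}{3N^2} \tag{52} \]

Finally, the recursion leads to

\[ \mathbb{E}\left[ x_{\mathrm{avg}}^2(i) \right] \le x_{\mathrm{avg}}^2(0) + \frac{2|\mathcal{M}| \Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \tag{53} \]

Note that in this equation, $x_{\mathrm{avg}}^2(0)$ is bounded since it is the average of the initial conditions, i.e., at time 0. Thus $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ is an $\mathcal{L}_2$-bounded martingale; hence, it converges a.s. and in $\mathcal{L}_2$ to a finite random variable $\theta$ ([?]). In other words,

\[ \mathbb{P}\left[ \lim_{i \to \infty} x_{\mathrm{avg}}(i) = \theta \right] = 1 \tag{54} \]

Again, Theorem 2 implies that as $i \to \infty$ we have $x(i) \to x_{\mathrm{avg}}(i)\mathbf{1}$ a.s. This and (54) prove the Theorem. ■

We extend Theorems 2 and 3 to derive the mean squared (m.s.s.) consensus of the sensor states to the random variable $\theta$ under additional assumptions on the weight sequence $\{\alpha(i)\}_{i \ge 0}$.

Lemma 4 Let the weight sequence $\{\alpha(i)\}_{i \ge 0}$ be of the form:

\[ \alpha(i) = \frac{a}{(i+1)^{\tau}} \tag{55} \]

where $a > 0$ and $.5 < \tau \le 1$. Then the a.s. convergence in Theorem 3 holds in m.s.s. also, i.e.,

\[ \lim_{i \to \infty} \mathbb{E}\left[ (x_n(i) - \theta)^2 \right] = 0, \quad \forall n \tag{56} \]

Proof: The proof is provided in Appendix I.

B. QC Algorithm: Mean-Squared Error

Theorem 3 shows that the sensors reach consensus asymptotically and in fact converge a.s. to a finite random variable $\theta$. Viewing $\theta$ as an estimate of the initial average $r$ (see eqn. (1)), we characterize its desirable statistical properties in the following Lemma.

Lemma 5 Let $\theta$ be as given in Theorem 3 and $r$, the initial average, as given in eqn. (1). Define

\[ \zeta = \mathbb{E}\left[ \theta - r \right]^2 \tag{57} \]

to be the m.s.e. Then, we have:

1) Unbiasedness:

\[ \mathbb{E}[\theta] = r \]

2) M.S.E. Bound:

\[ \zeta \le \frac{2|\mathcal{M}| \Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \]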

Proof: The proof follows from the arguments presented in the proof of Theorem 3 and is omitted. ■

We note that the m.s.e. bound in Lemma 5 is conservative. Recalling the definition of $\mathcal{M}(i)$ as the set of active links at time $i$ (see eqn. (19)), we have (by revisiting the arguments in the proof of Theorem 3)



2A 



(58) 



j>0 



(Note that the term $\sum_{j \ge 0} \alpha^2(j)\, \mathbb{E}\left[ |\mathcal{M}(j)|^2 \right]$ is well-defined as $\mathbb{E}\left[ |\mathcal{M}(i)|^2 \right] \le |\mathcal{M}|^2$, $\forall i$.) In case we have a fixed (non-random) topology, $\mathcal{M}(i) = \mathcal{M}$, $\forall i$, and the bound in eqn. (58) reduces to the one in Lemma 5. For the asynchronous gossip protocol in [?], $|\mathcal{M}(i)| = 1$, $\forall i$, and hence

\[ \zeta_{\mathrm{gossip}} \le \frac{2\Delta^2}{3N^2} \sum_{j \ge 0} \alpha^2(j) \tag{59} \]

Lemma 5 shows that, for a given $\Delta$, $\zeta$ can be made arbitrarily small by properly scaling the weight sequence, $\{\alpha(i)\}_{i \ge 0}$. We formalize this. Given an arbitrary weight sequence, $\{\alpha(i)\}_{i \ge 0}$, which satisfies the persistence condition (20), define the scaled weight sequence, $\{\alpha_s(i)\}_{i \ge 0}$, as

\[ \alpha_s(i) = s\alpha(i), \quad \forall i \ge 0 \tag{60} \]

where $s > 0$ is a constant scaling factor. Clearly, such a scaled weight sequence satisfies the persistence condition (20), and the m.s.e. $\zeta_s$ obtained by using this scaled weight sequence satisfies

\[ \zeta_s \le \frac{2|\mathcal{M}| \Delta^2 s^2}{3N^2} \sum_{i \ge 0} \alpha^2(i) \tag{61} \]

showing that, by proper scaling of the weight sequence, the m.s.e. can be made arbitrarily small.


However, reducing the m.s.e. by scaling the weights in this way will reduce the convergence rate of the algorithm. This tradeoff is considered in the next subsection.

C. QC Algorithm: Convergence Rate

A detailed pathwise convergence rate analysis can be carried out for the QC algorithm using strong approximations like laws of iterated logarithms etc., as is the case with a large class of stochastic approximation algorithms. More generally, we can study formally some moderate deviations asymptotics ([?],[?]) or take recourse to concentration inequalities ([?]) to characterize convergence rate. Due to space limitations we do not pursue such analysis in this paper; rather, we present convergence rate analysis for the state sequence $\{x(i)\}_{i \ge 0}$ in the m.s.s. and that of the mean state vector sequence. We start by studying the convergence of the mean state vectors, which is simple, yet illustrates an interesting trade-off between the achievable convergence rate and the mean-squared error $\zeta$ through design of the weight sequence $\{\alpha(i)\}_{i \ge 0}$.

From the asymptotic unbiasedness of $\theta$ we have

\[ \lim_{i \to \infty} \mathbb{E}[x(i)] = r\mathbf{1} \tag{62} \]

Our objective is to determine the rate at which the sequence $\{\mathbb{E}[x(i)]\}_{i \ge 0}$ converges to $r\mathbf{1}$.

Lemma 6 Without loss of generality, make the assumption

\[ \alpha(i) \le \frac{2}{\lambda_2(\overline{L}) + \lambda_N(\overline{L})}, \quad \forall i \tag{63} \]

(We note that this holds eventually, as the $\alpha(i)$ decrease to zero.) Then,

\[ \left\| \mathbb{E}[x(i)] - r\mathbf{1} \right\| \le \left( e^{-\lambda_2(\overline{L}) \left( \sum_{0 \le j \le i-1} \alpha(j) \right)} \right) \left\| \mathbb{E}[x(0)] - r\mathbf{1} \right\| \tag{64} \]

Proof: We note that the mean state propagates as

\[ \mathbb{E}[x(i+1)] = \left( I - \alpha(i)\overline{L} \right) \mathbb{E}[x(i)], \quad \forall i \tag{65} \]

The proof then follows from [?] and is omitted. ■

It follows from Lemma 6 that the rate at which the sequence $\{\mathbb{E}[x(i)]\}_{i \ge 0}$ converges to $r\mathbf{1}$ is closely related to the rate at which the weight sequence, $\alpha(i)$, sums to infinity. On the other hand, to achieve a small bound $\zeta$ on the m.s.e., see Lemma 5 in Subsection III-B, we need to make the weights small, which reduces the convergence rate of the algorithm. The parameter $s$ introduced in eqn. (60) can then be viewed as a scalar control parameter, which can be used to trade off between precision (m.s.e.) and convergence rate. More specifically, if a cost structure is imposed on the consensus problem, where the objective is a function of the m.s.e. and the convergence rate, one may obtain the optimal scaling $s$ minimizing the cost from the Pareto-optimal curve generated by varying $s$. This is significant, because the algorithm allows one to trade off m.s.e. vs. convergence rate, and in particular, if the application requires precision (low m.s.e.), one can make the m.s.e. arbitrarily small irrespective of the quantization step-size $\Delta$. It is important to note in this context that, though the algorithms in [?], [?] lead to finite m.s.e., the resulting m.s.e. is proportional to $\Delta^2$, which may become large if the step-size $\Delta$ is chosen to be large.

Note that this tradeoff is established between the convergence rate of the mean state vectors and the m.s.e. of the limiting consensus variable $\theta$. But, in general, even for more appropriate measures of the convergence rate, we expect that, intuitively, the same tradeoff will be exhibited, in the sense that the rate of convergence will be closely related to the rate at which the weight sequence, $\alpha(i)$, sums to infinity. We end this subsection by studying the m.s.s. convergence rate of the state sequence $\{x(i)\}_{i \ge 0}$, which is shown to exhibit a similar trade-off.
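The mean-state bound of Lemma 6 can be checked numerically on a small example. The sketch below makes several assumptions not in the source: a static 5-node ring (so the mean Laplacian is the ring Laplacian, whose second smallest eigenvalue is $2 - 2\cos(2\pi/5)$), gains $\alpha(i) = 0.4/(i+1)^{0.75}$, and hypothetical helper names. For this ring the largest gain equals $2/(\lambda_2 + \lambda_N) = 0.4$, so the assumption of the lemma holds from the first iteration.

```python
import math


def ring_laplacian(n):
    """Graph Laplacian of an n-node ring (n >= 3)."""
    lap = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in ((u - 1) % n, (u + 1) % n):
            lap[u][u] += 1.0
            lap[u][v] -= 1.0
    return lap


def mean_state_trajectory(mean0, lap, alphas):
    """Iterate the mean recursion m(i+1) = (I - alpha(i) * L) m(i)."""
    n = len(mean0)
    m = list(mean0)
    trajectory = [m]
    for alpha in alphas:
        lap_m = [sum(lap[r][c] * m[c] for c in range(n)) for r in range(n)]
        m = [m[r] - alpha * lap_m[r] for r in range(n)]
        trajectory.append(m)
    return trajectory


def deviation_norm(m):
    """Euclidean distance of m from (average of m) * 1."""
    r = sum(m) / len(m)
    return math.sqrt(sum((v - r) ** 2 for v in m))


if __name__ == "__main__":
    lap = ring_laplacian(5)
    lambda2 = 2.0 - 2.0 * math.cos(2.0 * math.pi / 5.0)
    alphas = [0.4 / (i + 1) ** 0.75 for i in range(50)]
    traj = mean_state_trajectory([0.0, 2.5, 5.0, 7.5, 10.0], lap, alphas)
    partial = 0.0
    for i, m in enumerate(traj):
        bound = math.exp(-lambda2 * partial) * deviation_norm(traj[0])
        print(i, deviation_norm(m), bound)
        if i < len(alphas):
            partial += alphas[i]
```

The printed deviation should stay below the exponential envelope at every iteration, and both shrink faster when the gains sum to infinity faster.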

Lemma 7 Let the weight sequence $\{\alpha(i)\}_{i \ge 0}$ be of the form:

\[ \alpha(i) = \frac{a}{(i+1)^{\tau}} \tag{66} \]

where $a > 0$ and $.5 < \tau \le 1$. Then the m.s.s. error

2A 2 (~L) 

evolves as follows: For every < e < there 
exists i £ > 0, such that, for all i > i s we have 




Proof: The proof is provided in Appendix I. ■ 
From the above we note that slowing up the sequence 
{a(i)}i>o decreases the polynomial terms on the R.H.S. 
of eqn. (67), but increases the exponential terms and 
since the effect of exponentials dominate that of the 
polynomials we see a similar trade-off between m.s.e. 
and convergence rate (m.s.s.) as observed when studying 
the mean state vector sequence above. 

D. QC Algorithm: Generalizations 

The QC algorithm can be extended to handle more 
complex situations of imperfect communication. For 
instance, we may incorporate Markovian link failures (as 
in [?]) and time-varying quantization step-size with the 
same type of analysis. 

Markovian packet dropouts can be an issue in some 
practical wireless sensor network scenarios, where ran- 
dom environmental phenomena like scattering may lead 
to temporal dependence in the link quality. Another situ- 
ation arises in networks of mobile agents, where physical 
aspects of the transmission like channel coherence time, 
channel fading effects are related to the mobility of the 
dynamic network. A general analysis of all such scenar- 
ios is beyond the scope of the current paper. However, 
when temporal dependence is manifested through a state 
dependent Laplacian (this occurs in mobile networks, 
formation control problems in multi-vehicle systems), 
under fairly general conditions, the link quality can be 
modeled as a temporal Markov process as in [?] (see 
Assumption 1.2 in [?]). Due to space limitations of the
current paper, we do not present a detailed analysis in
this context and refer the interested reader to [?], where
such temporally Markov link failures were addressed
in detail, though in the context of unquantized analog
transmission.

The current paper focuses on quantized transmission 
of data and neglects the effect of additive analog noise. 
Even in such a situation of digital transmission, the 
message decoding process at the receiver may lead to 
analog noise. Our approach can take into account such 
generalized distortions and the main results will continue 
to hold. For analysis purposes, temporally independent zero mean analog noise can be incorporated as an additional term on the R.H.S. of eqn. (16) and subsequently absorbed into the zero mean vectors $\Upsilon(i)$, $\Psi(i)$. Digital transmission where bits can get flipped due to noise would be more challenging to address.

The case of time-varying quantization may be relevant in many practical communication networks, where, because of a bit-budget, the quantization may become coarser as time progresses (the step-size increases). It may also arise if one considers a rate allocation protocol with vanishing rates as time progresses (see [?]). In that case, the quantization step-size sequence, $\{\Delta(i)\}_{i \ge 0}$, is time-varying with possibly

\[ \limsup_{i \to \infty} \Delta(i) = \infty \tag{68} \]

Also, as suggested in [?], one may consider a rate allocation scheme in which the quantizer becomes finer as time progresses. In that way, the quantization step-size sequence, $\{\Delta(i)\}_{i \ge 0}$, may be a decreasing sequence.

Generally, in a situation like this, to attain consensus the link weight sequence $\{\alpha(i)\}_{i \ge 0}$ needs to satisfy a generalized persistence condition of the form

\[ \sum_{i \ge 0} \alpha(i) = \infty, \quad \sum_{i \ge 0} \alpha^2(i) \Delta^2(i) < \infty \tag{69} \]

Note, when the quantization step-size is bounded, this reduces to the persistence condition assumed earlier. We state without proof the following result for the time-varying quantization case.

Theorem 8 Consider the QC algorithm with time-varying quantization step size sequence $\{\Delta(i)\}_{i \ge 0}$ and let the link weight sequence $\{\alpha(i)\}_{i \ge 0}$ satisfy the generalized persistence condition in eqn. (69). Then the sensors reach consensus to an a.s. finite random variable. In other words, there exists an a.s. finite random variable $\theta$, such that,

\[ \mathbb{P}\left[ \lim_{i \to \infty} x_n(i) = \theta, \ \forall n \right] = 1 \tag{70} \]

Also, if $r$ is the initial average, then

\[ \mathbb{E}\left[ (\theta - r)^2 \right] \le \frac{2|\mathcal{M}|}{3N^2} \sum_{i \ge 0} \alpha^2(i) \Delta^2(i) \tag{71} \]

It is clear that in this case also, we can trade off m.s.e. with convergence rate by tuning a scalar gain parameter $s$ associated with the link weight sequence.

IV. Consensus with Quantized Data: Bounded 
Initial Sensor State 

We consider consensus with quantized data and bounded range quantizers when the initial sensor states are bounded, and this bound is known a priori. We show that finite bit quantizers (whose outputs take only a finite number of values) suffice. The algorithm QCF that we consider is a simple modification of the QC algorithm of Section III. The good performance of the QCF algorithm relies on the fact that, if the initial sensor states are bounded, the state sequence, $\{x(i)\}_{i \ge 0}$, generated by the QC algorithm remains uniformly bounded with high probability, as we prove here. In this case, channel quantizers with finite dynamic range perform well with high probability.

We briefly state the QCF problem in Subsection IV-A. Then, Subsection IV-B shows that with high probability the sample paths generated by the QC algorithm are uniformly bounded, when the initial sensor states are bounded. Subsection IV-C proves that QCF achieves asymptotic consensus. Finally, Subsections IV-D and IV-E analyze its statistical properties, performance, and tradeoffs.

A. QCF Algorithm: Statement 

The QCF algorithm modifies the QC algorithm by restricting the alphabet of the quantizer to be finite. It assumes that the initial sensor state $x(0)$, whose average we wish to compute, is known to be bounded. Of course, even if the initial state is bounded, the states of QC can become unbounded. The good performance of QCF is a consequence of the fact that, as our analysis will show, the states $\{x(i)\}_{i \ge 0}$ generated by the QC algorithm when started with a bounded initial state $x(0)$ remain uniformly bounded with high probability.

The following are the assumptions underlying QCF. We let the state sequence for QCF be represented by $\{\tilde{x}(i)\}_{i \ge 0}$.

1) Bounded initial state. Let $b > 0$. The QCF initial state $\tilde{x}(0) = x(0)$ is bounded to the set $\mathcal{B}$ known a priori

\[ \mathcal{B} = \left\{ y \in \mathbb{R}^{N \times 1} \mid |y_n| \le b < +\infty \right\} \tag{72} \]

2) Uniform quantizers and finite alphabet. Each inter-sensor communication channel in the network uses a uniform $\lceil \log_2(2p+1) \rceil$ bit quantizer with step-size $\Delta$, where $p > 0$ is an integer. In other words, the quantizer output takes only $2p+1$ values, and the quantization alphabet is given by

\[ \mathcal{Q} = \{ l\Delta \mid l = 0, \pm 1, \cdots, \pm p \} \tag{73} \]

Clearly, such a quantizer will not saturate if the input falls in the range $[(-p - 1/2)\Delta, (p + 1/2)\Delta)$; if the input goes out of that range, the quantizer saturates.

3) Uniform i.i.d. noise. As with QC, $\{\nu_{nl}(i)\}_{i \ge 0,\, 1 \le n,l \le N}$ is a sequence of i.i.d. random variables uniformly distributed on $[-\Delta/2, \Delta/2)$.

4) The link failure model is the same as used in QC.

Given this setup, we present the distributed QCF algorithm, assuming that the sensor network is connected. The state sequence, $\{\tilde{x}(i)\}_{i \ge 0}$, is given by the following Algorithm.

Algorithm 1: QCF
Initialize
  $\tilde{x}_n(0) = x_n(0)$, $\forall n$;
  $i = 0$;
begin
  while $\sup_{1 \le n \le N} \sup_{l \in \Omega_n(i)} |\tilde{x}_l(i) + \nu_{nl}(i)| < (p + 1/2)\Delta$ do
    $\tilde{x}_n(i+1) = (1 - \alpha(i) d_n(i))\, \tilde{x}_n(i) + \alpha(i) \sum_{l \in \Omega_n(i)} q(\tilde{x}_l(i) + \nu_{nl}(i))$, $\forall n$;
    $i = i + 1$;
  end
  Stop the algorithm and reset all the sensor states to zero

The last step of the algorithm can be distributed, since the network is connected.
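Algorithm 1 can be sketched as follows. This is an illustration under assumptions that go beyond the source: a fixed topology given as neighbor lists, a finite list of gains in place of an unbounded loop, and hypothetical function names. The saturation test mirrors the while-condition of Algorithm 1: the run continues only while every dithered channel input has magnitude below $(p + 1/2)\Delta$.

```python
import math
import random


def quantize(y, delta):
    """Uniform quantizer of eqn. (4) with step delta."""
    return math.floor(y / delta + 0.5) * delta


def quantizer_bits(p):
    """Bits needed to index the 2p + 1 levels: ceil(log2(2p + 1))."""
    return math.ceil(math.log2(2 * p + 1))


def qcf_run(x0, neighbors, delta, p, alphas, seed=0):
    """Follow the QC update until some channel input reaches the
    saturation limit (p + 1/2) * delta; on saturation, reset all states
    to zero. Returns (states, saturated)."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    limit = (p + 0.5) * delta
    for alpha in alphas:
        # Draw one dither value per directed link (receiver, sender).
        dither = {(node, l): rng.uniform(-delta / 2, delta / 2)
                  for node in range(n) for l in neighbors[node]}
        if any(abs(x[l] + nu) >= limit for (_, l), nu in dither.items()):
            return [0.0] * n, True
        x = [(1 - alpha * len(neighbors[node])) * x[node]
             + alpha * sum(quantize(x[l] + dither[(node, l)], delta)
                           for l in neighbors[node])
             for node in range(n)]
    return x, False


if __name__ == "__main__":
    ring = [[4, 1], [0, 2], [1, 3], [2, 4], [3, 0]]
    gains = [0.4 / (i + 1) ** 0.6 for i in range(200)]
    x0 = [0.0, 2.5, 5.0, 7.5, 10.0]
    print(quantizer_bits(1000), qcf_run(x0, ring, 0.1, 1000, gains))
    print(quantizer_bits(3), qcf_run(x0, ring, 0.1, 3, gains))
```

With a generous range (large $p$) the run behaves like QC; with a range smaller than the initial data, the first iteration already saturates and the states are reset.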

B. Probability Bounds on Uniform Boundedness of Sam- 
ple Paths of QC 

The analysis of the QCF algorithm requires uniformity properties of the sample paths generated by the QC algorithm. This is necessary, because the QCF algorithm follows the QC algorithm until one of the quantizers gets overloaded. The uniformity properties require establishing statistical properties of the supremum taken over the sample paths, which is carried out in this subsection. We show that the state vector sequence, $\{x(i)\}_{i \ge 0}$, generated by the QC algorithm is uniformly bounded with high probability. The proof follows by splitting the sequence $\{x(i)\}_{i \ge 0}$ as the sum of the sequences $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ and $\{x_{\mathcal{C}^\perp}(i)\}_{i \ge 0}$, for which we establish uniformity results. The proof is lengthy and uses mainly maximal inequalities for submartingale and supermartingale sequences.

Recall that the state vector at any time $i$ can be decomposed orthogonally as

\[ x(i) = x_{\mathrm{avg}}(i)\mathbf{1} + x_{\mathcal{C}^\perp}(i) \tag{74} \]

where the consensus subspace, $\mathcal{C}$, is given in eqn. (21). We provide probability bounds on the sequences $\{x_{\mathrm{avg}}(i)\}_{i \ge 0}$ and $\{x_{\mathcal{C}^\perp}(i)\}_{i \ge 0}$ and then use a union bound to get the final result.

The rest of the subsection concerns the proof of Theorem 12, which involves several intermediate lemmas as stated below, whose proofs are provided in Appendix II.

We need the following result.

Lemma 9 Consider the QC algorithm stated in Section II and let $\{x(i)\}_{i \ge 0}$ be the state sequence it generates. Define the function $W(i,x)$, $i \ge 0$, $x \in \mathbb{R}^{N \times 1}$, as

\[ W(i,x) = (1 + V(i,x)) \prod_{j \ge i} [1 + g(j)] \tag{75} \]

where $V(i,x) = x^T \overline{L} x$ and $\{g(j)\}_{j \ge 0}$ is defined in eqn. (42).² Then, the process $\{W(i, x(i))\}_{i \ge 0}$ is a non-negative supermartingale with respect to the filtration $\{\mathcal{F}_i\}_{i \ge 0}$ defined in eqn. (45).

The next Lemma bounds the sequence {x c ±(i)} i>0 . 

Lemma 10 Let {x(i)} i>0 be the state vector sequence 
generated by the QC algorithm, with the initial state 
x(0) G M Arxl . Consider the orthogonal decomposition: 



x(i) = x avg (i)l + x c ±(i), Vz 



x(0)G 



pJVxl 



Then, for any a > 0, 



sup|z avg (j)| > a 



< 



^v g (o) + ^E,>o« 2 (j) 



1/2 



(78) 



Theorem 12 Let {x(i)} i>0 be the state vector sequence 
generated by the QC algorithm, with an initial state 
x(0) G R Nxl . Then, for any a > 0, 



sup || X 

j>0 



> a 



< 



1/2 



- + 



(1+X(( 



(79) 



where {g{j)}j>o is defined in eqn. (42). 



We now state as a Corollary the result on the bounded- 
ness of the sensor states, which will be used in analyzing 
the performance of the QCF algorithm. 

Corollary 13 Assume that the initial sensor state, x(0) G 
B, where B is given in eqn. (72). Then, if {x(i)} i>0 
is the state sequence generated by the QC algorithm 
starting from the initial state, x(0), we have, for any 

a > 0, 



sup \x n (J)\ > a 

l<n<N,j>0 



< 



2iV6 2 + ^JA !E 2 (i) 



1/2 



1 + J 



(80) 



(76) where {g{j)}j>o is defined in eqn. (42). 



Then, for any a > 0, we have 

(l + x(0fLx(0))n j >o(l 



sup||x ci (j)|| 2 > a 

3>Q 



< 



l + a\ 2 (L) 



(77) 



where {g(j)}j>o is defined in eqn. (42). 



C. Algorithm QCF: Asymptotic Consensus 

show that the QCF algorithm, given in Subsec- 
tion IV-A, converges a.s. to a finite random variable and 
the sensors reach consensus asymptotically. 



Next, we provide probability bounds on the uniform 
boundedness of {x- dvg (i)} . >Q . 

Lemma 11 Let {xmg(i)} i>0 be the average sequence 
generated by the QC algorithm, with an initial state 

2 The above function is well-defined because the term 
Ylj>i [1 + 9(j)} i s fi n i te f° r an y 3< b Y tne persistence condition on 
the weight sequence. 



Theorem 14 (QCF: a.s. asymptotic consensus) Let
$\{\tilde{x}(i)\}_{i\geq 0}$ be the state vector sequence generated by the
QCF algorithm, starting from an initial state
$\tilde{x}(0) = x(0) \in \mathcal{B}$. Then, the sensors reach consensus
asymptotically a.s. In other words, there exists an a.s. finite random
variable $\tilde{\theta}$ such that

$\mathbb{P}\left[\lim_{i\to\infty} \tilde{x}(i) = \tilde{\theta}\mathbf{1}\right] = 1$   (81)

Proof: For the proof, consider the sequence $\{x(i)\}_{i\geq 0}$ generated
by the QC algorithm, with the same initial state $x(0)$. Let $\theta$ be
the a.s. finite random variable (see eqn. 43) such that

$\mathbb{P}\left[\lim_{i\to\infty} x(i) = \theta\mathbf{1}\right] = 1$   (82)

It is clear that

$\tilde{\theta} = \begin{cases} \theta & \text{on } \left\{\sup_{i\geq 0}\sup_{1\leq n\leq N}\sup_{l\in\Omega_n(i)} |x_l(i) + \nu_{nl}(i)| < (p + 1/2)\Delta\right\} \\ 0 & \text{otherwise} \end{cases}$   (83)

In other words, we have

$\tilde{\theta} = \theta\, \mathbb{I}\left(\sup_{i\geq 0}\sup_{1\leq n\leq N}\sup_{l\in\Omega_n(i)} |x_l(i) + \nu_{nl}(i)| < \left(p + \tfrac{1}{2}\right)\Delta\right)$   (84)

where $\mathbb{I}(\cdot)$ is the indicator function. Since

$\left\{\sup_{i\geq 0}\sup_{1\leq n\leq N}\sup_{l\in\Omega_n(i)} |x_l(i) + \nu_{nl}(i)| < (p + 1/2)\Delta\right\}$

is a measurable set, it follows that $\tilde{\theta}$ is a random
variable. ■

D. QCF: $\epsilon$-Consensus

Recall the QCF algorithm in Subsection IV-A and the assumptions 1)-4). A
key step is that, if we run the QC algorithm using finite bit quantizers
with finite alphabet Q as in eqn. (73), the only way for an error to
occur is for one of the quantizers to saturate. This is the intuition
behind the design of the QCF algorithm.

Theorem 14 shows that the QCF sensor states asymptotically reach
consensus, converging a.s. to a finite random variable $\tilde{\theta}$.
The next series of results addresses the question of how close this
consensus is to the desired average $r$ in (1). Clearly, this depends on
the QCF design: 1) the quantizer parameters (like the number of levels
$2p + 1$ or the quantization step $\Delta$); 2) the random network
topology; and 3) the gains $\alpha$.

We define the following performance metrics, which characterize the
performance of the QCF algorithm.

Definition 15 (Probability of $\epsilon$-consensus and
consensus-consistency) The probability of $\epsilon$-consensus is defined
as

$T(G, b, \alpha, \epsilon, p, \Delta) = \mathbb{P}\left[\lim_{i\to\infty} \sup_{1\leq n\leq N} |\tilde{x}_n(i) - r| < \epsilon\right]$   (85)

Note that the argument $G$ in the definition of $T(\cdot)$ emphasizes the
influence of the network configuration, whereas $b$ is given in eqn. (72).

The QCF algorithm is consensus-consistent$^3$ iff for every $G$, $b$,
$\epsilon > 0$ and $0 < \delta < 1$, there exist quantizer parameters
$p$, $\Delta$ and weights $\{\alpha(i)\}_{i\geq 0}$, such that

$T(G, b, \alpha, \epsilon, p, \Delta) > 1 - \delta$   (86)

Theorem 17 characterizes the probability of $\epsilon$-consensus, while
Proposition 18 considers several tradeoffs between the probability of
achieving consensus and the quantizer parameters and network topology,
and, in particular, shows that the QCF algorithm is consensus-consistent.
We need the following Lemma to prove Theorem 17.

Lemma 16 Let $\tilde{\theta}$ be defined as in Theorem 14, with the
initial state $\tilde{x}(0) = x(0) \in \mathcal{B}$. The desired average,
$r$, is given in (1). Then, for any $\epsilon > 0$, we have

$\mathbb{P}\left[|\tilde{\theta} - r| \geq \epsilon\right] \leq \frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2}\sum_{j\geq 0}\alpha^2(j) + \frac{\left[2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\geq 0}\alpha^2(j)\right]^{1/2}}{p\Delta} + \frac{(1 + N\lambda_N(\overline{L})b^2)\prod_{j\geq 0}(1 + g(j))}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\overline{L})}$

where $\{g(j)\}_{j\geq 0}$ is defined in eqn. (42).

Proof: The proof is provided in Appendix III. ■
We now state the main result of this Section, which provides a
performance guarantee for QCF.

Theorem 17 (QCF: Probability of $\epsilon$-consensus) For any
$\epsilon > 0$, the probability of $\epsilon$-consensus
$T(G, b, \alpha, \epsilon, p, \Delta)$ is bounded below:

$\mathbb{P}\left[\lim_{i\to\infty} \sup_{1\leq n\leq N} |\tilde{x}_n(i) - r| < \epsilon\right] \geq 1 - \frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2}\sum_{j\geq 0}\alpha^2(j) - \frac{\left[2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\geq 0}\alpha^2(j)\right]^{1/2}}{p\Delta} - \frac{(1 + N\lambda_N(\overline{L})b^2)\prod_{j\geq 0}(1 + g(j))}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\overline{L})}$   (89)

where $\{g(j)\}_{j\geq 0}$ is defined in eqn. (42).

Proof: It follows from Theorem 14 that

$\lim_{i\to\infty} \tilde{x}_n(i) = \tilde{\theta}$ a.s., $\forall\, 1 \leq n \leq N$   (90)

The proof then follows from Lemma 16. ■

3 Consensus-consistent means for arbitrary $\epsilon > 0$, the QCF
quantizers can be designed so that the QCF states get within an
$\epsilon$-ball of $r$ with arbitrarily high probability. Thus, a
consensus-consistent algorithm trades off accuracy with bit-rate.

The lower bound on $T(\cdot)$, given by (89), is uniform, in the sense
that it is applicable for all initial states $x(0) \in \mathcal{B}$.
Recall the scaled weight sequence $\alpha_s$, given by eqn. (60). We
introduce the zero-rate probability of $\epsilon$-consensus,
$T_z(G, b, \epsilon, p, \Delta)$, by

$T_z(G, b, \epsilon, p, \Delta) = \lim_{s\to 0} T(G, b, \alpha_s, \epsilon, p, \Delta)$   (91)

The next proposition studies the dependence of the $\epsilon$-consensus
probability $T(\cdot)$ and of the zero-rate probability $T_z(\cdot)$ on
the network and algorithm parameters.

Proposition 18 (QCF: Tradeoffs)

1) Limiting quantizer. For fixed $G$, $b$, $\alpha$, $\epsilon$, we have

$\lim_{\Delta\to 0,\ p\Delta\to\infty} T(G, b, \alpha, \epsilon, p, \Delta) = 1$   (92)

Since this holds for arbitrary $\epsilon > 0$, we note that, as
$\Delta \to 0$, $p\Delta \to \infty$,

$\mathbb{P}\left[\lim_{i\to\infty} \tilde{x}(i) = r\mathbf{1}\right] = \lim_{\epsilon\to 0} \mathbb{P}\left[\lim_{i\to\infty} \sup_{1\leq n\leq N} |\tilde{x}_n(i) - r| < \epsilon\right] = \lim_{\epsilon\to 0}\left[\lim_{\Delta\to 0,\ p\Delta\to\infty} T(G, b, \alpha, \epsilon, p, \Delta)\right] = 1$

In other words, the QCF algorithm leads to a.s. consensus to the desired
average $r$, as $\Delta \to 0$, $p\Delta \to \infty$. In particular, it
shows that the QCF algorithm is consensus-consistent.

2) Zero-rate $\epsilon$-consensus probability. For fixed $G$, $b$,
$\epsilon$, $p$, $\Delta$, we have

$T_z(G, b, \epsilon, p, \Delta) \geq 1 - \frac{(2Nb^2)^{1/2}}{p\Delta} - \frac{1 + N\lambda_N(\overline{L})b^2}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\overline{L})}$   (93)

3) Optimum quantization step-size $\Delta$. For fixed $G$, $b$, $\alpha$,
$\epsilon$, $p$, the optimum quantization step-size $\Delta$, which
maximizes the lower bound (89) on the probability of $\epsilon$-consensus,
$T(G, b, \alpha, \epsilon, p, \Delta)$, is given by

$\Delta^*(G, b, \alpha, \epsilon, p) = \arg\inf_{\Delta\geq 0} \left[\frac{2|\mathcal{M}|\Delta^2}{3N^2\epsilon^2}\sum_{j\geq 0}\alpha^2(j) + \frac{\left[2Nb^2 + \frac{4|\mathcal{M}|\Delta^2}{3N}\sum_{j\geq 0}\alpha^2(j)\right]^{1/2}}{p\Delta} + \frac{(1 + N\lambda_N(\overline{L})b^2)\prod_{j\geq 0}(1 + g(j))}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\overline{L})}\right]$   (94)

where $\{g(j)\}_{j\geq 0}$ is defined in eqn. (42).

Proof: For item 2), we note that, as $s \to 0$,

$\sum_{j\geq 0}\alpha_s^2(j) \to 0, \qquad \prod_{j\geq 0}(1 + g_s(j)) \to 1$

The rest follows by simple inspection of eqn. (89). ■

We comment on Proposition 18. Item 1) shows that the algorithm QCF is
consensus-consistent, in the sense that we can achieve arbitrarily good
performance by decreasing the step-size $\Delta$ and increasing the
number of quantization levels, $2p + 1$, appropriately. Indeed,
decreasing the step-size increases the precision of the quantized output
and increasing $p$ increases the dynamic range of the quantizer. However,
the fact that $\Delta \to 0$ but $p\Delta \to \infty$ implies that the
rate of growth of the number of levels $2p + 1$ should be higher than the
rate of decay of $\Delta$, guaranteeing that in the limit we have
asymptotic consensus with probability one.

Interpreting item 2), we recall the m.s.e. versus convergence rate
tradeoff for the QC algorithm, studied in Subsection III-B. There, we
considered a quantizer with a countably infinite number of output levels
(as opposed to the finite number of output levels in the QCF) and
observed that the m.s.e. can be made arbitrarily small by rescaling the
weight sequence. By Chebyshev's inequality, this would imply that, for
arbitrary $\epsilon > 0$, the probability of $\epsilon$-consensus, i.e.,
that we get within an $\epsilon$-ball of the desired average, can be made
as close to 1 as we want. However, this occurs at a cost of the
convergence rate, which decreases as the scaling factor $s$ decreases.
Thus, for the QC algorithm, in the limiting case, as $s \to 0$, the
probability of $\epsilon$-consensus (for arbitrary $\epsilon > 0$) goes
to 1; we call this limiting probability the zero-rate probability of
$\epsilon$-consensus, justifying the m.s.e. vs. convergence rate
tradeoff.$^4$ Item 2) shows that, similar to the QC algorithm, the QCF
algorithm exhibits a tradeoff between the probability of
$\epsilon$-consensus and the convergence rate, in the sense that, by
scaling (decreasing $s$), the probability of $\epsilon$-consensus can be
increased. However, contrary to the QC case, scaling will not lead to a
probability of $\epsilon$-consensus arbitrarily close to 1, and, in fact,
the zero-rate probability of $\epsilon$-consensus is strictly less than
one, as given by eqn. (93). In other words, by scaling, we can make
$T(G, b, \alpha_s, \epsilon, p, \Delta)$ as high as
$T_z(G, b, \epsilon, p, \Delta)$, but no higher.

We now interpret the lower bound on the zero-rate probability of
$\epsilon$-consensus, $T_z(G, b, \epsilon, p, \Delta)$, and show that the
network topology plays an important role in this context. We note that,
for a fixed number, $N$, of sensor nodes, the only way the topology
enters into the expression of the lower bound is through the third term
on the R.H.S. Then, assuming that

$N\lambda_N(\overline{L})b^2 \gg 1, \qquad \frac{p^2\Delta^2}{2}\lambda_2(\overline{L}) \gg 1,$

we may use the approximation

$\frac{1 + N\lambda_N(\overline{L})b^2}{1 + \frac{p^2\Delta^2}{2}\lambda_2(\overline{L})} \approx \left(\frac{2Nb^2}{p^2\Delta^2}\right)\frac{\lambda_N(\overline{L})}{\lambda_2(\overline{L})}$   (95)

Let us interpret eqn. (95) in the case where the topology is fixed
(non-random). Then for all $i$, $L(i) = \overline{L} = L$. Thus, for a
fixed number, $N$, of sensor nodes, topologies with smaller
$\lambda_N(L)/\lambda_2(L)$ will lead to higher zero-rate probability of
$\epsilon$-consensus and, hence, are preferable. We note that, in this
context, for fixed $N$, the class of non-bipartite Ramanujan graphs gives
the smallest $\lambda_N(L)/\lambda_2(L)$ ratio, given a constraint on the
number, $M$, of network edges (see [?]).

4 Note that, for both the algorithms, QC and QCF, we can take the
scaling factor, $s$, arbitrarily close to 0, but not zero, so that these
limiting performance values are not achievable, but we may get
arbitrarily close to them.
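
To make the role of the ratio $\lambda_N(L)/\lambda_2(L)$ concrete, the sketch below compares two fixed topologies whose Laplacian spectra are known in closed form: the cycle on $N$ nodes (eigenvalues $2 - 2\cos(2\pi k/N)$) and the complete graph (eigenvalues $0$ and $N$). It then evaluates the right-hand side of the approximation (95). The two graphs have different numbers of edges, so this is not the edge-constrained comparison made in the text; it only illustrates how the ratio enters the bound. All function names are hypothetical.

```python
import math


def cycle_ratio(n):
    """lambda_N / lambda_2 for the Laplacian of the cycle graph on n >= 3 nodes."""
    eig = sorted(2 - 2 * math.cos(2 * math.pi * k / n) for k in range(n))
    return eig[-1] / eig[1]  # eig[0] is the zero eigenvalue


def complete_ratio(n):
    """lambda_N / lambda_2 for the complete graph on n >= 2 nodes (both equal n)."""
    return 1.0


def approx_third_term(n, b, p, step, ratio):
    """Right-hand side of approximation (95): (2 N b^2 / (p Delta)^2) * ratio."""
    return (2 * n * b ** 2 / (p * step) ** 2) * ratio
```

For $N = 6$, the cycle has ratio 4 while the complete graph has ratio 1, so the topology-dependent term in the approximation is four times larger for the cycle at the same $b$, $p$, and $\Delta$.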

Item 3) shows that, for given graph topology G, initial 
sensor data, b, the link weight sequence a, tolerance e, 
and the number of levels in the quantizer p, the step- 
size A plays a significant role in determining the perfor- 
mance. This gives insight into the design of quantizers to 
achieve optimal performance, given a constraint on the 
number of quantization levels, or, equivalently, given a 
bit budget on the communication. 
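
As a rough illustration of the step-size design in Item 3), the sketch below evaluates the right-hand side of the lower bound (89) and picks, from a finite grid of candidate step-sizes, the one that maximizes it. It is a grid-search stand-in for (94), not the numerical procedure used in the paper. Because $g(j)$ of eqn. (42) is not reproduced in this section, the product $\prod_{j\geq 0}(1 + g(j))$ is passed in as the assumed input `prod_g`; the function and parameter names are hypothetical. For weights $\alpha(j) = s/(j+1)$, the sum $\sum_{j\geq 0}\alpha^2(j)$ equals $s^2\pi^2/6$.

```python
import math


def sum_alpha_sq_harmonic(s):
    """Sum over j >= 0 of alpha(j)^2 for alpha(j) = s / (j + 1)."""
    return s * s * math.pi ** 2 / 6


def consensus_lower_bound(step, p, n, m, b, eps, lam2, lam_n, sum_alpha_sq, prod_g):
    """Right-hand side of (89); may be negative, in which case it is vacuous."""
    # Chebyshev term for the deviation of the limit from the average r.
    t1 = 2 * m * step ** 2 * sum_alpha_sq / (3 * n ** 2 * eps ** 2)
    # Saturation terms from Corollary 13 evaluated at a = p * Delta.
    t2 = math.sqrt(2 * n * b ** 2 + 4 * m * step ** 2 * sum_alpha_sq / (3 * n)) / (p * step)
    t3 = (1 + n * lam_n * b ** 2) * prod_g / (1 + 0.5 * (p * step) ** 2 * lam2)
    return 1 - t1 - t2 - t3


def best_step(p, candidates, **params):
    """Candidate step-size with the largest lower bound (grid version of (94))."""
    return max(candidates, key=lambda step: consensus_lower_bound(step, p, **params))
```

A finer grid or a scalar minimizer could replace the grid search; the grid keeps the sketch dependency-free.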

In the next Subsection, we present some numerical 
studies on the QCF algorithm, which demonstrate prac- 
tical implications of the results just discussed. 

E. QCF: Numerical Studies 

We present a set of numerical studies on the quantizer step-size
optimization problem, considered in Item 3) of Proposition 18. We
consider a fixed (non-random) sensor network of $N = 230$ nodes, with
communication topology given by an LPS-II Ramanujan graph (see [?]) of
degree 6.$^5$ We fix $\epsilon$ at 0.05, and take the initial sensor data
bound, $b$, to be 30. We numerically solve the step-size optimization
problem given in (94) for a varying number of levels, $2p + 1$.
Specifically, we consider two instances of the optimization problem. In
the first instance, we consider the weight sequence
$\alpha(i) = 0.01/(i + 1)$ ($s = 0.01$), and numerically solve the
optimization problem for a varying number of levels. In the second
instance, we repeat the same experiment with the weight sequence
$\alpha(i) = 0.001/(i + 1)$ ($s = 0.001$). As in eqn. (94),
$\Delta^*(G, b, \alpha_s, \epsilon, p)$ denotes the optimal step-size.
Also, let $T^*(G, b, \alpha_s, \epsilon, p)$ be the corresponding optimum
probability of $\epsilon$-consensus. Fig. 1 on the left plots
$T^*(G, b, \alpha_s, \epsilon, p)$ for varying $2p + 1$ on the vertical
axis, while on the horizontal axis, we plot the corresponding quantizer
bit-rate $BR = \log_2(2p + 1)$. The two plots correspond to two different
scalings, namely, $s = 0.01$ and $s = 0.001$, respectively. The result is
in strict agreement with Item 2) of Proposition 18, and shows that, as
the scaling factor decreases, the probability of $\epsilon$-consensus
increases, until it reaches the zero-rate probability of
$\epsilon$-consensus.

5 This is a 6-regular graph, i.e., all the nodes have degree 6.

Fig. 1 on the right plots $\Delta^*(G, b, \alpha_s, \epsilon, p)$ for
varying $2p + 1$ on the vertical axis, while on the horizontal axis, we
plot the corresponding quantizer bit-rate $BR = \log_2(2p + 1)$. The two
plots correspond to two different scalings, namely, $s = 0.01$ and
$s = 0.001$, respectively. The results are again in strict agreement with
Proposition 18 and further show that optimizing the step-size is an
important quantizer design problem, because the optimal step-size value
is sensitive to the number of quantization levels, $2p + 1$.




V. Conclusion 

The paper considers distributed average consensus with quantized
information exchange and random inter-sensor link failures. We add dither
to the sensor states before quantization. We show by stochastic
approximation that, when the range of the quantizer is unbounded (the QC
algorithm), the sensor states achieve a.s. and m.s.s. consensus to a
random variable whose mean is the desired average. The variance of this
random variable can be made small by tuning parameters of the algorithm
(rate of decay of the gains), the network topology, and the quantizer
parameters. When the range of the quantizer is bounded (the QCF
algorithm), a sample path analysis shows that the state vector of the QC
algorithm can be made to remain uniformly bounded with probability
arbitrarily close to 1. This means that the QCF algorithm achieves
$\epsilon$-consensus. We use the bounds that we derive for the
probability of large excursions of the sample paths to formulate a
quantizer design problem that trades off several quantizer parameters:
number of bits (or levels), step size, probability of saturation, and
error margin to consensus. A numerical study illustrates this design
problem and several interesting tradeoffs among the design parameters.

Appendix I 
Proofs of Lemmas 4 and 7 

Before deriving Lemmas 4 and 7, we present a result from [?] on a
property of real number sequences to be used later; see the proof in [?].

Lemma 19 (Lemma 18 in [?]) Let the sequences $\{r_1(t)\}_{t\geq 0}$ and
$\{r_2(t)\}_{t\geq 0}$ be given by

$r_1(t) = \frac{a_1}{(t+1)^{\delta_1}}, \qquad r_2(t) = \frac{a_2}{(t+1)^{\delta_2}}$   (96)

where $a_1, a_2, \delta_2 \geq 0$ and $0 \leq \delta_1 \leq 1$. Then, if
$\delta_1 = \delta_2$, there exists $K > 0$ such that, for non-negative
integers $s < t$,

$0 \leq \sum_{k=s}^{t-1}\left[\prod_{l=k+1}^{t-1}(1 - r_1(l))\right] r_2(k) \leq K$   (97)

Moreover, the constant $K$ can be chosen independently of $s$, $t$. Also,
if $\delta_1 < \delta_2$, then, for arbitrary fixed $s$,

$\lim_{t\to\infty} \sum_{k=s}^{t-1}\left[\prod_{l=k+1}^{t-1}(1 - r_1(l))\right] r_2(k) = 0$   (98)

Fig. 1. Left: $T^*(G, b, \alpha_s, \epsilon, p)$ vs. $2p + 1$ ($BR = \log_2(2p + 1)$).
Right: $\Delta^*(G, b, \alpha_s, \epsilon, p)$ vs. $2p + 1$ ($BR = \log_2(2p + 1)$).



The second term on the R.H.S. of (103) falls under 
Lemma 19 whose second part (eqn. (98)) implies 



lim y 



n 1 



=j+i 



|2 A|(L) 
Aiv(L) 



e a 



(0 



We 

lim,- 



conclude 



from 



eqn. 



(105) 
(103) that 



j E 

Proof: [Proof of Lemma 4] Taking expectations lim;-,™ E 



(unconditional) on both sides of eqn. (41) we have 



V(i,x(i))} = 0. This with (99) implies 



I x C-l(«) 



0. From the orthogonality 



arguments we have for all i 



E[V(* + l,x(i + l))] < E[y(i,x(i))]-a(i)E[ V (i,xOa||| x (i)_ei|| 
+ ff (*)[l+E[V(*,x(i))]] 

We also have the following inequalities for all i: 



E 



\xc{i)-61\ 



+E 



x c ±(i)| 
(106) 

The second term in eqn. (106) goes to zero by the above, 
whereas the first term goes to zero by the £2 convergence 
X 2 (L) \\x c ±\\ 2 < V(i,x(i)) = xJj_Lx C i < A A r( J L) ||x c ± || 2 G f t h e sequence {x avg (i)} t > to 6 and the desired m.s.s. 

(99) 

convergence follows. ■ 
Proof: [Proof of Lemma 7] From (99,103), using 



\\{L) ||x c ± || 2 < <p(i,x(i)) = x£ ± L 2 x c ± < \%(L) \\x c ± || 2 



(100) 



From eqns. (99,99,100) we have 



repeatedly 1 — a < e a for a > 0, we have for i > i £ 



<_L-E[V(i,x(i))] < 



i-1 



A 2 (L) Z - 



E[V(* + l,x(i + l))]< l-2a(i) -2^+0(0 )E[V(i,k(ij)]+g{i) A 2 (L) 
V Ajv(-k) / 

(101) 

2A 2 fZ/) 

Choose < e < x^jjj and note that, the form of g(i) in 
eqn. (42) and the fact that a(i) — » as i — > 00 suggests From the development in the proof of Theorem 3 we 
that there exists i £ > 0, such that, ea(i) > g(i), i > i e . note that 
We then have 



E[V(i + l,x(i + l))] < 1- 2 



Xn(L) 



E 



||x c (i)-rlf" 


= N 2 E 


||x av g(? 


)-r\\\ 




> is 







(108) 



1=0 



(102) 



Continuing the recursion we have for i > i £ , 



+ s 



e 




We then arrive at the result by using the equality 

||x(i)-rl|| 2 = ||x c x(i)|| 2 + ||x c (i)-rl|| 2 , 



£ )a(j) E[K(i,x(i e ))] 



W) 



Appendix II 



Proofs of results in Subsection IV-B 



E[v(iMQ)]+Y: I n (H 2 r7^- e ) a(i) ) )fc?) 



3 T¥ofy^yPrbof of Lemma 9] From eqn/(41) we 



where we use 1 — a < e a for a > 0. Since the a(i)s jjave 
sum to infinity, we have 



lim e 

'CO 



= 



E[V(i+l,x(i + l))|x(*)] < -a(iV(i,x(i)) 
(104) + ff (i)[l + V(i.x(i))] + V(i,x(i)) 
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We then have 

E[W(i+l,x(i + l))|^i] = E 



Uwof: [Proof j if Lei una 11] It was shown in The- 
(1 + V{%+1, x(i + 1))) rf 11 + gO)] At) 



< 



orem>?_|that the sequence [x avg (i)}. >0 is a martingale. It 

j>*+ 1 negative submartingale (see [?]). 

II I 1 + ffC?)] (l - "(^^fe^nU^eln^^et &fes X ^)or a 

j>»+i 

-a(iM*,x(i)) J] + 

j>i+l L°<J'<* ,->J 

-a(i)^(i, x(i)) JT [1 ^l^f+fW^t^^ntinuity of probability ^95 



a 



> 
(116) 



sures, 



j>i+i 

Hence E + 1, x(i + 1)) | Ti] < W(i, x(i)) and the 
result follows. ■ 

Proof: [Proof of Lemma 10] For any a > and 

i > 0, we have 



sup|x a vg(j)l > a 



Thus, we have 



sup|x avg (j)l > a 

J">0 



lim P 



< lim 



max |x avg (j)| > a 

0>]>i 



E[|x avg (i) 



(117) 



(118) 



(the limit on the right exists because x avg (i) converges 
|x c ^)|| 2 > a xf^IxW > a\ 2 (L) (111) ^ ^ } ^ wg have from gqn (53)> foj . ^ . 



Define the potential function V (i,x) as in Theorem 2 

and eqn. (37) and the W (i,x) as in (75) in Lemma 9. E [Kv g («)|] < < [E [|x avg (i)|' 
It then follows from eqn. (Ill) that 

\\x c ±(i)\\ 2 > a => W(i,x(i)) > l + oA 2 (L) (112) 



1/2 



< 



^av g (°) + ^2-E a 0) 



j>0 



Combining eqns. (118,119), we have 



By Lemma 9, the process (W(i, x(i)), Fi) is a non- 
negative supermartingale. Then by a maximal inequality 
for non-negative supermartingales (see [?]) we have for 
a > and i > 0, 

E[W(0,x(0))] 



Sup |x av g(j)l > a 
j>0 



< 



1/2 



(120) 



Prao/:- [Proof of Theorem 12] Since, ||x(j)|| 2 = 



max W(j,x(j)) > a 

0<]<i 



< 



(113) 



NxlJi) + \\xc±(j)\\, we have 



Also, we note that 



supW(j,x(j)) > a 



sup || x 

J>0 



> a 



< 



SUpiV|x av g(j)| 2 > - 
j>0 * 



sup ||X C J 

j>0 



Ui> { max W(j,x(j')) > a 

\0<j<i 

(114) 



SUp|Xavg(j)l > 



Since {m&x <j<iW(j,x(j)) > 0} is a non-decreasing 
sequence of sets in i, it follows from the continuity of 
probability measures and eqn. (112) 



We thus have from Lemmas 10 and 11, 



(2A/O 



1/2 



sup ||X C J 
J>0 



sup ||x 

j>0 



> a 



< 



^vg(o) + ^^E,>o« 2 0-) " (i + x(o) 3 



1/2 



'_o_\V2 
>2«/ 



sup ||X C _L 
j>0 



> a 



= lim I 
< lim 1 

i — >oo 



max ||x c x(i)|| > a 

0<]<i 



(122) 



(115) 



max W(j,x(j)) > 1 + ^f£)l Proof of Corollary 13] We note that, for 



0<j<i 



x(0) G B, J 

E[VK(0,x(0))] _ (l+x(0) T £x(0))n 3 >o(l + g(j)) 
- i^So l + a\ 2 (L) x^ vg ^^I)x(0) J Lx(0) < N\ N (L)b 2 (123) 
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From Theorem 12, we then get, 



sup \x n (j)\ > a 

l<n<N,j>0 



< 



< 



< 



Then, for any S > 0, 



sup|[x(j)|| > a 



1/2 



— - - - - + N>ui<^ + 7y ^ 2; 



sup sup sup + i/„z(i)| > (p+ - ) A 

i>0 l<n<N len n (i) 



Appendix III 
Proofs of Lemma 16 



1/2 



< 



sup sup |x„(i)| > 

i>0 Kn<N 



< 



2Nb 2 X-^m^ q 2(j )] 1/2 (l + jVAjy(I)&a) J 



pA-5 

where, in the last step, we use eqn. (124.) Since the 
above holds for arbitrary S > 0, we have 



1 + 



0^0 



< lim 

.510 



2 , 4|.M|A 2 



E 1>0 a2 (i) 



3JV ^i>0 



1/2 



pA-5 



+ 



(l + N\ N {L 

I7i 



2Ar6» + ^-Ej>o (1 + ^(1)^)1 



1/2 



pA 

Combining eqns. (125,126,128), we get the result. 



1 + 



Proof: [Proof of Lemma 16] For the proof, con- 
sider the sequence {x(i)} i>0 generated by the QC 
algorithm, with the same initial state x(0). Let 8 be the 
a.s. finite random variable (see eqn. 43) such that 



lim x(i) = 01 



= 1 



(124) 



We note that 

> e 



> ej n [6 = 
(\9-r\ >e)n (0 = 0) 
< F[\6-r\ >e]+p[#V 
From Chebyshev's inequality, we have 

2\M\A 2 



( - r > e) n (0 = 0) +P ( - r > ej n (o ^ 6} 



>ejn(8^ 



(125) 



3 



P[|0-r|>e] < 



Next, we bound 



i < 



37V 2 e 2 



7^ 



j>0 

To this end, we note that 



sup sup sup \xi(i) + v n i{i)\ < sup sup sup |xz(«)|+sup sup sup |v n «(*)l 

i>0 l<n<N lsn n (i) i>Q l<n<N len n (i) i>0 l<n< N le£l„ (i) 

< sup sup |x„(i)|+sup sup sup \v n i{i)\ 

i>0 l<n<N i>0 l<n<N len n (i) 

< sup sup |a; n (i)| + ^ (126) 

i>0 Kn<N * 



